NOVEL NUCLEIC ACIDS AND SECRETED
POLYPEPTIDES



1. CROSS REFERENCE TO RELATED APPLICATIONS 

This application is a continuation-in-part application of U.S. Application Serial No.
09/552,317 filed April 25, 2000 entitled "Novel Contigs Obtained from Various Libraries",
Attorney Docket No. 784CIP, which in turn is a continuation-in-part application of U.S.
Application Serial No. 09/488,725 filed January 21, 2000 entitled "Novel Contigs Obtained
from Various Libraries", Attorney Docket No. 784; U.S. Application Serial No. 09/491,404
filed January 25, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney
Docket No. 785; U.S. Application Serial No. 09/560,875 filed April 27, 2000 entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/496,914 filed February 03,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 787;
U.S. Application Serial No. 09/577,409 filed May 18, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 788CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/515,126 filed February 28,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 788;
U.S. Application Serial No. 09/574,454 filed May 19, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 789CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/519,705 filed March 07,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 789;
U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is a
continuation-in-part application of U.S. Application Serial No. 09/540,217 filed March 31,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 790;
U.S. Application Serial No. 09/770,160 filed January 26, 2001 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 791CIP, which is in turn a
continuation-in-part application of U.S. Application Serial No. 09/552,929 filed April 18,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 791;
and U.S. Application Serial No. 09/577,408 filed May 18, 2000 entitled "Novel Contigs
Obtained from Various Libraries", Attorney Docket No. 792; all of which are incorporated
herein by reference in their entirety.

2. BACKGROUND OF THE INVENTION

2.1 TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2.2 BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such 
as lymphokines, interferons, circulating soluble factors, chemokines, and interleukins) has 
matured rapidly over the past decade. The now routine hybridization cloning and expression 
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on 
information directly related to the discovered protein (i.e., partial DNA/amino acid sequence 
of the protein in the case of hybridization cloning; activity of the protein in the case of 
expression cloning). More recent "indirect" cloning techniques such as signal sequence 
cloning, which isolates DNA sequences based on the presence of a now well-recognized 
secretory leader sequence motif, as well as various PCR-based or low stringency 
hybridization-based cloning techniques, have advanced the state of the art by making 
available large numbers of DNA/amino acid sequences for proteins that are known to have 
biological activity, for example, by virtue of their secreted nature in the case of leader 
sequence cloning, by virtue of their cell or tissue source in the case of PCR-based 
techniques, or by virtue of structural similarity to other genes of known biological activity. 

Identified polynucleotide and polypeptide sequences have numerous applications in,
for example, diagnostics, forensics, gene mapping, identification of mutations responsible
for genetic disorders or other traits, assessment of biodiversity, and production of many
other types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize
one or more epitopes present on such polypeptides, as well as hybridomas producing such
antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically engineered
to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public
databases. The invention relates also to the proteins encoded by such polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-1041, or 2083-2534 and are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C
is cytosine; G is guanine; T is thymine; and N is any of the four bases or unknown. In the
amino acids provided in the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences
that hybridize to the complement of SEQ ID NO: 1-1041, or 2083-2534 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ
ID NO: 1-1041, or 2083-2534. Also included is a polynucleotide comprising a nucleotide
sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-1041, or
2083-2534 or a degenerate variant or fragment thereof. The identifying sequence can be 100
base pairs in length.
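
As an illustration of the 90% identity criterion, the following minimal sketch compares two sequences of equal length position by position. The specification does not state how identity is to be computed, so the ungapped comparison and the function names `percent_identity` and `meets_identity_threshold` are assumptions made only for this example; in practice an alignment program would normally be used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped, position-by-position identity between two equal-length sequences.

    Assumption for illustration only: the sequences are already aligned and
    have the same length, so no gaps need to be handled.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """Return True when the identity is at least `threshold` percent."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For a 100 base pair identifying sequence, this criterion tolerates at most ten mismatched positions.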

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-2534 that
uniquely identifies or represents the sequence information of SEQ ID NO: 1-1041, or
2083-2534.

A collection as used in this application can be a collection of only one polynucleotide.
The collection of sequence information or identifying information of each sequence can be
provided on a nucleic acid array. In one embodiment, segments of sequence information are
provided on a nucleic acid array to detect the polynucleotide that contains the segment. The
array can be designed to detect full-match or mismatch to the polynucleotide that contains the
segment. The collection can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid 
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; 
and host cells or organisms transformed with these expression vectors. Nucleic acid sequences 
(or their reverse or direct complements) according to the invention have numerous applications 
in a variety of techniques known to those skilled in the art of molecular biology, such as use as 
hybridization probes, use as primers for PCR, use in an array, use in computer-readable media, 
use in sequencing full-length genes, use for chromosome and gene mapping, use in the 
recombinant production of protein, and use in the generation of anti-sense DNA or RNA, their 
chemical analogs and the like. 

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083- 
2534 or novel segments or parts of the nucleic acids of the invention are used as primers in 
expression assays that are well known in the art. In a particularly preferred embodiment, the 
nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534 or novel segments or parts of the 
nucleic acids provided herein are used in diagnostics for identifying expressed genes or, as well 
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed 
sequence tags for physical mapping of the human genome. 

The isolated polynucleotides of the invention include, but are not limited to, a 
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1- 
1041, or 2083-2534; a polynucleotide comprising any of the full length protein coding 
sequences of SEQ ID NO: 1-1041, or 2083-2534; and a polynucleotide comprising any of the 
nucleotide sequences of the mature protein coding sequences of SEQ ID NO: 1-1041, or 2083- 
2534. The polynucleotides of the present invention also include, but are not limited to, a 
polynucleotide that hybridizes under stringent hybridization conditions to (a) the complement of 
any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; (b) a 
nucleotide sequence encoding any one of the amino acid sequences set forth in SEQ ID NO: 1- 
1041, or 2083-2534; (c) a polynucleotide which is an allelic variant of any polynucleotides 
recited above; (d) a polynucleotide which encodes a species homolog (e.g. orthologs) of any of 
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of any of the polypeptides comprising an amino acid sequence set 
forth in SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the 
corresponding full length or mature protein. Polypeptides of the invention also include 
polypeptides with biological activity that are encoded by (a) any of the polynucleotides having 
a nucleotide sequence set forth in SEQ ID NO: 1-1041, or 2083-2534; or (b) polynucleotides
that hybridize to the complement of the polynucleotides of (a) under stringent hybridization 
conditions. Biologically active variants of any of the polypeptide sequences in the Sequence 
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological 
activity are also contemplated. The polypeptides of the invention may be wholly or partially
chemically synthesized but are preferably produced by recombinant means using the genetically
engineered cells (e.g. host cells) of the invention. 

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such 
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a 
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the
polypeptide from the culture or from the host cells. Preferred embodiments include those in
which the protein produced by such processes is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety 
of techniques known to those skilled in the art of molecular biology. These techniques 
include use as hybridization probes, use as oligomers, or primers, for PCR, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
generation of anti-sense DNA or RNA, their chemical analogs and the like. For example, 
when the expression of an mRNA is largely restricted to a particular cell or tissue type, 
polynucleotides of the invention can be used as hybridization probes to detect the presence 
of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a 
polypeptide of the invention can be used to generate an antibody that specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or 
quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as 
molecular weight markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprise the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of the present
invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, 
for example, in methods for the prevention and/or treatment of disorders involving aberrant
protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the 
polynucleotides or polypeptides of the invention in a sample. Such methods can, for 
example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited 
herein and for the identification of subjects exhibiting a predisposition to such conditions. 

The invention provides a method for detecting the polynucleotides of the invention in a
sample, comprising contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of interest for a period sufficient to form the complex and 
under conditions sufficient to form a complex and detecting the complex such that if a 
complex is detected, the polynucleotide of interest is detected. The invention also provides a
method for detecting the polypeptides of the invention in a sample comprising contacting the
sample with a compound that binds to and forms a complex with the polypeptide under 
conditions and for a period sufficient to form the complex and detecting the formation of the 
complex such that if a complex is formed, the polypeptide is detected. 

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the
invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs, 
and monitoring the progress of patients, involved in clinical trials for the treatment of 
disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or 
polypeptides of the invention. Such methods can be utilized, for example, for the 
identification of compounds that can ameliorate symptoms of disorders as recited herein. 

Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The 
invention provides a method for identifying a compound that binds to the polypeptides of the 
invention comprising contacting the compound with a polypeptide of the invention in a cell 
for a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell; and detecting the complex by detecting
the reporter gene sequence expression such that if expression of the reporter gene is detected 
the compound that binds to a polypeptide of the invention is identified. 

The methods of the invention also provide methods for treatment which involve the 
administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and 
other substances that modulate the overall activity of the target gene products. Compounds 
and other substances can affect such modulation either on the level of target gene/protein 
expression or target protein activity. 

The polypeptides of the present invention and the polynucleotides encoding them are
also useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family 
(as set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides
and polynucleotides of the present invention are useful for a variety of applications, as
described herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 






4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms 
"a", "an" and "the" include plural references unless the context clearly dictates otherwise. 



WO 03/080795 



PCT/US02/25485




The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide 
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of
the natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic acids bind or it
may be "complete" such that total complementarity exists between the single stranded
molecules. The degree of complementarity between the nucleic acid strands has significant
effects on the efficiency and strength of the hybridization between the nucleic acid strands.
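
The base-pairing rule in this definition can be sketched in a few lines. The example below reproduces the 5'-AGT-3' / 3'-TCA-5' pair from the text and scores partial versus complete complementarity as the fraction of aligned positions that pair. The function names and the simple position-wise scoring are assumptions for illustration; they do not model hybridization strength.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Complementary strand written 3'->5', i.e. aligned base-for-base under the input."""
    return "".join(COMPLEMENT[base] for base in strand_5_to_3.upper())


def reverse_complement(strand_5_to_3: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement_3_to_5(strand_5_to_3)[::-1]


def fraction_complementary(strand_5_to_3: str, partner_3_to_5: str) -> float:
    """Fraction of aligned positions that base-pair (1.0 means complete complementarity)."""
    if len(strand_5_to_3) != len(partner_3_to_5) or not strand_5_to_3:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        1
        for a, b in zip(strand_5_to_3.upper(), partner_3_to_5.upper())
        if COMPLEMENT.get(a) == b
    )
    return paired / len(strand_5_to_3)
```

Here `complement_3_to_5("AGT")` gives `"TCA"`, matching the example in the definition, and any score below 1.0 corresponds to partial complementarity.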

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ
line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a
steady and continuous source of germ cells for the production of gametes. The term 
"primordial germ cells (PGCs)" refers to a small population of cells set aside from other cell 
lineages particularly from the yolk sac, mesenteries, or gonadal ridges during embryogenesis 
that have the potential to differentiate into germ cells and other cells. PGCs are the source
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are
capable of self-renewal. Thus these cells not only populate the germ line and give rise to a 
plurality of terminally differentiated cells that comprise the adult specialized organs, but are 
able to regenerate themselves. 

The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. 
EMFs include, but are not limited to, promoters, and promoter modulating sequences
(inducible elements). One class of EMFs is nucleic acid fragments which induce the
expression of an operably linked ORF in response to a specific regulatory factor or 
physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or 
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the 
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like 
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and
N is A, C, G, or T (U) or unknown. It is contemplated that where the polynucleotide is 
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil). 
Generally, nucleic acid segments provided by this invention may be assembled from 
fragments of the genome and short oligonucleotide linkers, or from a series of 
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid which is 
capable of being expressed in a recombinant transcriptional unit comprising regulatory 
elements derived from a microbial or viral operon, or a eukaryotic gene. 

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or 
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of 
nucleotide residues which are at least about 5 nucleotides, more preferably at least about 7 
nucleotides, more preferably at least about 9 nucleotides, more preferably at least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is preferably 
less than about 500 nucleotides, preferably less than about 200 nucleotides, more preferably 
less than about 100 nucleotides, more preferably less than about 50 nucleotides and most 
preferably less than 30 nucleotides. Preferably the probe is from about 6 nucleotides to 
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more preferably 
from about 17 to 30 nucleotides and most preferably from about 20 to 25 nucleotides. 
Preferably the fragments can be used in polymerase chain reaction (PCR), various 
hybridization procedures or microarray procedures to identify or amplify identical or related 
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify each 
polynucleotide sequence of the present invention. Preferably the fragment comprises a 
sequence substantially similar to any one of SEQ ID NO: 1-1041, or 2083-2534. 

Probes may, for example, be used to determine whether specific mRNA molecules 
are present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods 
well known in the art. Probes of the present invention, their preparation and/or labeling are 
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are incorporated 
herein by reference in their entirety. 

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-2534
that uniquely identifies or represents the sequence information of that sequence of SEQ ID
NO: 1-1041, or 2083-2534, or those segments identified in Tables 3, 5, 6, and 8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there
are three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five
because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
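
The arithmetic behind these estimates can be checked with a short sketch. It divides the number of possible k-mers (4^k) by the number of base pairs in the target, using the genome size and the 5% expressed fraction given in the text. Treating bases as independent and equally likely is a simplifying assumption, and the names `possible_kmers` and `kmers_per_position` are hypothetical.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on the expressed share of the genome (from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the four bases."""
    return 4 ** k


def kmers_per_position(k: int, target_bp: float) -> float:
    """How many distinct k-mers exist per base pair of target sequence.

    A value of N corresponds to a chance of roughly 1 in N that a given
    k-mer is fully matched somewhere in the target.
    """
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    cases = [
        ("20-mer vs. genome", 20, GENOME_BP),
        ("17-mer vs. genome", 17, GENOME_BP),
        ("15-mer vs. expressed sequences", 15, expressed_bp),
    ]
    for label, k, target_bp in cases:
        print(f"{label}: about 1 in {kmers_per_position(k, target_bp):.1f}")
```

Under these assumptions the twenty-mer ratio comes out somewhat above 300, so the figures quoted in the text read as rounded estimates.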

Similarly, when using sequence information for detecting a single mismatch, a segment
can be a twenty-five mer. The probability that the twenty-five mer would appear in a human
genome with a single mismatch is calculated by multiplying the probability for a full match
(1 ÷ 4^25) times the increased probability for mismatch at each nucleotide position (3 x 25). The
probability that an eighteen mer with a single mismatch can be detected in an array for
expression studies is approximately one in five. The probability that a twenty-mer with a single
mismatch can be detected in a human genome is approximately one in five.
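
The single-mismatch calculation described above, the full-match probability 1/4^k multiplied by the 3 x k possible single-base changes, can be sketched as follows. The function names are hypothetical, and multiplying by the target size to get an expected number of sites is an added assumption that treats every position as an independent trial.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on the expressed share of the genome (from the text)


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-mer differs from a given k-mer at exactly one position.

    Follows the text: (1 / 4^k) for a full match, times 3 alternative bases
    at each of the k positions.
    """
    return (3 * k) / 4 ** k


def expected_single_mismatch_sites(k: int, target_bp: float) -> float:
    """Expected number of single-mismatch sites in a target of `target_bp` positions."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    print("25-mer, genome:", expected_single_mismatch_sites(25, GENOME_BP))
    print("20-mer, genome:", expected_single_mismatch_sites(20, GENOME_BP))
    print("18-mer, expressed sequences:", expected_single_mismatch_sites(18, expressed_bp))
```

The values printed are expected counts per target, so a result well below 1 means a chance single-mismatch hit is unlikely for a segment of that length.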

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
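
A minimal sketch of this definition: a sequence qualifies if it is a whole number of triplets and none of the in-frame triplets is a termination codon. The stop codons used (TAA, TAG, TGA) are those of the standard genetic code. The function names are hypothetical, and the check reads only the frame starting at the first base.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def split_codons(sequence: str) -> list[str]:
    """Split a sequence into consecutive triplets, reading from the first base."""
    seq = sequence.upper().replace("U", "T")  # accept RNA input as well
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def is_open_reading_frame(sequence: str) -> bool:
    """True if the sequence is a non-empty series of triplets with no termination codon."""
    if not sequence or len(sequence) % 3 != 0:
        return False
    return all(codon not in STOP_CODONS for codon in split_codons(sequence))
```

For example, `is_open_reading_frame("ATGGCCAAA")` is true, while `"ATGTAAGCC"` fails because its second triplet is a stop codon.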

The terms "operably linked" or "operably associated" refer to functionally related 
nucleic acid sequences. For example, a promoter is operably associated or operably linked 
with a coding sequence if the promoter controls the transcription of the coding sequence.
While operably linked nucleic acid sequences can be contiguous and in the same reading
frame, certain genetic elements, e.g., repressor genes, are not contiguously linked to the coding
sequence but still control transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number
of differentiated cell types that are present in an adult organism. A pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an 
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to naturally 
occurring or synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at least about 7
amino acids, more preferably at least about 9 amino acids and most preferably at least about 
17 or more amino acids. The peptide preferably is not greater than about 200 amino acids, 
more preferably less than 150 amino acids and most preferably less than 100 amino acids. 
Preferably the peptide is from about 5 to about 200 amino acids. To be active, any
polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells 
that have not been genetically engineered and specifically contemplates various polypeptides 
arising from post-translational modifications of the polypeptide including, but not limited to, 
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the
full-length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a 
peptide or protein without a signal or leader sequence. The "mature protein portion" means
that portion of the protein which does not include a signal or leader sequence. The peptide
may have been produced by processing in the cell which removes any leader/signal
sequence. The mature protein portion may or may not include the initial methionine residue.
The methionine residue may be removed from the protein during processing in the cell. The 
peptide may be produced synthetically or the protein may have been produced using a 
polynucleotide encoding only the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques
as ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally
occur in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using,
e.g., recombinant DNA techniques. Guidance in determining which amino acid residues
may be replaced, added or deleted without abolishing activities of interest, may be found by 
comparing the sequence of the particular polypeptide with that of homologous peptides and 
minimizing the number of amino acid sequence changes made in regions of high homology 
(conserved regions) or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may
be synthesized or selected by making use of the "redundancy" in the genetic code. Various
codon substitutions, such as the silent changes which produce various restriction sites, may 
be introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be
reflected in the polypeptide or domains of other peptides added to the polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such as ligand-binding 
affinities, interchain affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative
amino acid replacements. "Conservative" amino acid substitutions may be made on the
basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the 
amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino 
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and 
methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine,
and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic 
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20 amino acids, 
more preferably 1 to 10 amino acids. The variation allowed may be experimentally 
determined by systematically making insertions, deletions, or substitutions of amino acids in
a polypeptide molecule using recombinant DNA techniques and assaying the resulting
recombinant variants for activity. 
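
The residue groupings listed above can be written out as a simple lookup. The following is a minimal sketch, not a definitive classifier: the names `RESIDUE_CLASSES`, `residue_class`, and `is_conservative` are hypothetical, and treating "same class" as the only test for a conservative replacement is a simplifying assumption.

```python
# Residue classes as listed in the text above (one-letter amino acid codes).
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(aa):
    """Return the name of the class that contains the residue."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(original, replacement):
    """True when the two residues differ but fall in the same class (assumed rule)."""
    return (original.upper() != replacement.upper()
            and residue_class(original) == residue_class(replacement))
```

Under this sketch, a leucine to isoleucine change is reported as conservative, while a leucine to aspartic acid change is not.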

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with 
another amino acid residue in order to eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight,
more preferably at least 99% by weight, of the indicated biological macromolecules present
(but water, buffers, and other small molecules, especially molecules having a molecular
weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated
from at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic
acid or polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide 
is found in the presence of (if anything) only a solvent, buffer, ion, or other component 
normally present in a solution of the same. The terms "isolated" and "purified" do not
encompass nucleic acids or polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or
mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins
made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous substances
and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed 
in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; 
polypeptides or proteins expressed in yeast will have a glycosylation pattern in general 
different from those expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression
vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic element
or elements having a regulatory role in gene expression, for example, promoters or
enhancers, (2) a structural or coding sequence which is transcribed into mRNA and
translated into protein, and (3) appropriate transcription initiation and termination sequences. 
Structural units intended for use in yeast or eukaryotic expression systems preferably include 
a leader sequence enabling extracellular secretion of translated protein by a host cell. 
Alternatively, where recombinant protein is expressed without a leader or transport 
sequence, it may include an amino terminal methionine residue. This residue may or may 
not be subsequently cleaved from the expressed recombinant protein to provide a final 
product. 

The term "recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the 
recombinant transcriptional unit extrachromosomally. Recombinant expression systems as 
defined herein will express heterologous polypeptides or proteins upon induction of the 
regulatory elements linked to the DNA segment or synthetic gene to be expressed. This term 
also means host cells which have stably integrated a recombinant genetic element or 
elements having a regulatory role in gene expression, for example, promoters or enhancers. 
Recombinant expression systems as defined herein will express polypeptides or proteins 
endogenous to the cell upon induction of the regulatory elements linked to the endogenous 
DNA segment or gene to be expressed. The cells can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a 
membrane, including transport as a result of signal sequences in its amino acid sequence 
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation 
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in 
which they are expressed. "Secreted" proteins also include without limitation proteins that 
are transported across the membrane of the endoplasmic reticulum. "Secreted" proteins are 
also intended to include proteins containing non-typical signal sequences (e.g., Interleukin-1
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors
released from damaged cells (e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al.
(1998) Annu. Rev. Immunol. 16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a 
sequence may be naturally present on the polypeptides of the present invention or provided 
from heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in
the art as stringent. Stringent conditions can include highly stringent conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization
conditions are described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary 
stringent hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate 
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20- 
base oligonucleotides), and 60°C (for 23-base oligonucleotides). 
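
The exemplary wash temperatures for deoxyoligonucleotides can be captured as a small lookup keyed by oligonucleotide length. This is only an illustrative sketch: the names `WASH_TEMP_C` and `wash_temperature` are hypothetical, and the function deliberately refuses lengths that are not listed rather than interpolating between them.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text above.
WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature(oligo):
    """Return the listed wash temperature for an oligonucleotide sequence."""
    length = len(oligo)
    if length not in WASH_TEMP_C:
        # No interpolation: only the four listed lengths are supported.
        raise KeyError(f"no exemplary wash temperature listed for {length}-mers")
    return WASH_TEMP_C[length]
```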

As used herein, "substantially equivalent" or "substantially similar" can refer both to 
nucleotide and amino acid sequences, for example a mutant sequence, that varies from a 
reference sequence by one or more substitutions, deletions, or additions, the net effect of 
which does not result in an adverse functional dissimilarity between the reference and 
subject sequences. Typically, such a substantially equivalent sequence varies from one of 
those listed herein by no more than about 35% (i.e., the number of individual residue 
substitutions, additions, and/or deletions in a substantially equivalent sequence, as compared 
to the corresponding reference sequence, divided by the total number of residues in the 
substantially equivalent sequence is about 0.35 or less). Such a sequence is said to have 
65% sequence identity to the listed sequence. In one embodiment, a substantially 
equivalent, e.g., mutant, sequence of the invention varies from a listed sequence by no more 
than 30% (70% sequence identity); in a variation of this embodiment, by no more than 25% 
(75% sequence identity); and in a further variation of this embodiment, by no more than 
20% (80% sequence identity) and in a further variation of this embodiment, by no more than 
10% (90% sequence identity) and in a further variation of this embodiment, by no more than
5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid sequences 
according to the invention preferably have at least 80% sequence identity with a listed amino 
acid sequence, more preferably at least 85% sequence identity, more preferably at least 90% 
sequence identity, more preferably at least 95% sequence identity, more preferably at least 
98% sequence identity, and most preferably at least 99% sequence identity. Substantially 
equivalent nucleotide sequence of the invention can have lower percent sequence identities, 
taking into account, for example, the redundancy or degeneracy of the genetic code. 
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least
about 75% identity, more preferably at least about 80% sequence identity, more preferably at
least 85% sequence identity, more preferably at least 90% sequence identity, more preferably 
at least about 95% sequence identity, more preferably at least 98% sequence identity, and 
most preferably at least 99% sequence identity. For the purposes of the present invention, 
sequences having substantially equivalent biological activity and substantially equivalent 
expression characteristics are considered substantially equivalent. For the purposes of 
determining equivalence, truncation of the mature sequence (e.g., via a mutation which
creates a new stop codon) should be disregarded. Sequence identity may be determined, 
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). 
Identity between sequences can also be determined by other methods known in the art, e.g. 
by varying hybridization conditions. 
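
The variation ratio defined above (residue substitutions, additions, and deletions divided by the number of residues in the substantially equivalent sequence) can be computed from a pairwise alignment. The sketch below is an assumption-laden illustration: it expects two pre-aligned strings of equal length with `-` marking gaps, counts each gapped position as one change, and uses the hypothetical names `variation_fraction` and `within_variation_limit`. It covers only the sequence criterion, not the functional comparison that the definition also requires.

```python
def variation_fraction(reference_aln, subject_aln, gap="-"):
    """Changes (substitutions, additions, deletions) per residue of the subject."""
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    for ref, sub in zip(reference_aln.upper(), subject_aln.upper()):
        if ref == gap and sub == gap:
            continue  # column empty in both sequences carries no information
        if ref != sub:
            changes += 1  # substitution, addition (gap in ref) or deletion (gap in subject)
    subject_len = sum(1 for c in subject_aln if c != gap)
    if subject_len == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_len


def within_variation_limit(reference_aln, subject_aln, max_variation=0.35):
    """True when the subject varies from the reference by no more than the limit."""
    return variation_fraction(reference_aln, subject_aln) <= max_variation
```

For example, a single substitution across a ten-residue subject gives a ratio of 0.1, which corresponds to 90% sequence identity in the terms used above.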

The term "totipotent" refers to the capability of a cell to differentiate into all of the
cell types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that 
the DNA is replicable, either as an extrachromosomal element, or by chromosomal 
integration. The term "transfection" refers to the taking up of an expression vector by a 
suitable host cell, whether or not any coding sequences are in fact expressed. The term 
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a 
virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of 
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can be
readily identified using known UMFs as a target sequence or target motif with the
computer-based systems described below. The presence and activity of a UMF can be
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid
molecule is then incubated with an appropriate host under appropriate conditions and the 
uptake of the marker sequence is determined. As described above, a UMF will increase the 
frequency of uptake of a linked marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless 
the context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide encoding
any one of the peptide sequences of SEQ ID NO: 1-1041, or 2083-2534; and a
polynucleotide comprising the nucleotide sequence encoding the mature protein coding
sequence of the polynucleotides of any one of SEQ ID NO: 1-1041, or 2083-2534. The
polynucleotides of the present invention also include, but are not limited to, a polynucleotide 
that hybridizes under stringent conditions to (a) the complement of any of the nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534; (b) nucleotide sequences encoding any one
of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e)
a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of
the polypeptides of SEQ ID NO: 1042-2082, or 2535-2986 (for example, as set forth in
Tables 3, 5, 6, or 8). Domains of interest may depend on the nature of the encoded
polypeptide; e.g., domains in receptor-like polypeptides include ligand-binding,
extracellular, transmembrane, or cytoplasmic domains, or combinations thereof; domains in
immunoglobulin-like proteins include the variable immunoglobulin-like domains; domains
in enzyme-like polypeptides include catalytic and substrate binding domains; and domains in 
ligand polypeptides include receptor-binding domains. 

The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include the entire coding region of the cDNA or may represent a portion of
the coding region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with known methods
using the sequence information disclosed herein. Such methods include the preparation of 
probes or primers from the disclosed sequence information for identification and/or 
amplification of genes in appropriate genomic libraries or other sources of genomic materials. 
Further 5' and 3' sequence can be obtained using methods known in the art. For example, full
length cDNA or genomic DNA that corresponds to any of the polynucleotides of SEQ ID NO:
1-1041, or 2083-2534 can be obtained by screening appropriate cDNA or genomic DNA 
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID 
NO: 1-1041, or 2083-2534 or a portion thereof as a probe. Alternatively, the polynucleotides of
SEQ ID NO: 1-1041, or 2083-2534 may be used as the basis for suitable primer(s) that allow
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence 
information, representative fragment or segment information, or novel segment information for
the full-length gene.

The polynucleotides of the invention also provide polynucleotides including 
nucleotide sequences that are substantially equivalent to the polynucleotides recited above. 
Polynucleotides according to the invention can have, e.g., at least about 65%, at least about 
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least 
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, 
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a 
polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic 
acid sequence fragments that hybridize under stringent conditions to any of the nucleotide 
sequences of SEQ ID NO: 1-1041, or 2083-2534, or complements thereof, which fragment is 
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20
nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically 
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the invention 
from other polynucleotide sequences in the same family of genes or can differentiate human 
genes from genes of other species, and are preferably based on unique nucleotide sequences. 
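
One way to shortlist fragments "based on unique nucleotide sequences" is to keep only those fixed-length windows of a target sequence that do not occur on either strand of related sequences. The following is a rough sketch under stated assumptions: exact-match absence is used as a crude stand-in for specific hybridization (real probe selection would also weigh mismatches, melting temperature, and secondary structure), and the names `reverse_complement` and `unique_fragments` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def unique_fragments(target, related, k=17):
    """Return k-mers of `target` found on neither strand of any `related` sequence."""
    target = target.upper()
    background = set()
    for seq in related:
        for strand in (seq.upper(), reverse_complement(seq)):
            for i in range(len(strand) - k + 1):
                background.add(strand[i:i + k])
    seen = set()
    result = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in background and kmer not in seen:
            seen.add(kmer)  # report each distinct fragment once, in order of appearance
            result.append(kmer)
    return result
```

The default `k=17` mirrors one of the fragment lengths mentioned above; it is a parameter of the sketch, not a recommendation.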

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:
1-1041, or 2083-2534, a representative fragment thereof, or a nucleotide sequence at least 90%
identical, preferably 95% identical, to SEQ ID NO: 1-1041, or 2083-2534 with a sequence from 
another isolate of the same species. Furthermore, to accommodate codon variability, the 
invention includes nucleic acid molecules coding for the same amino acid sequences as do the 
specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of 
one codon for another codon that encodes the same amino acid is expressly contemplated. 
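
Whether two ORFs differ only by synonymous codons can be checked by translating both and comparing the results. The sketch below assumes the standard genetic code (NCBI translation table 1) and in-frame input starting at the first codon; the names `CODON_TABLE`, `translate`, and `encodes_same_protein` are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest); '*' marks a stop codon.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(orf):
    """Translate an in-frame DNA or RNA ORF into one-letter amino acid codes."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a, orf_b):
    """True when both ORFs translate to the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)
```

For instance, `ATGGCTCTG` and `ATGGCCTTA` differ at two codons yet both translate to Met-Ala-Leu, so the check reports them as encoding the same sequence.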

The nearest neighbor or homology results for the nucleic acids of the present invention,
including SEQ ID NO: 1-1041, or 2083-2534 can be obtained by searching a database using an
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search Tool) program is
used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the FASTXY algorithm, may be performed.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are 
also provided by the present invention. Species homologs may be isolated and identified by
making suitable probes or primers from the sequences provided herein and screening a
suitable nucleic acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which 
also encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic
acids encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These 
nucleic acid alterations can be made at sites that differ in the nucleic acids from different 
species (variable positions) or in highly conserved regions (constant regions). Sites at such 
locations will typically be modified in series, e.g., by substituting first with conservative
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid) and then with
more distant choices (e.g., hydrophobic amino acid to a charged amino acid), and then 
deletions or insertions may be made at the target site. Amino acid sequence deletions 
generally range from about 1 to 30 residues, preferably about 1 to 10 residues, and are 
typically contiguous. Amino acid insertions include amino- and/or carboxyl-terminal
fusions ranging in length from one to one hundred or more residues, as well as intrasequence
insertions of single or multiple amino acid residues. Intrasequence insertions may range
generally from about 1 to 10 amino acid residues, preferably from 1 to 5 residues. Examples of
terminal insertions include the heterologous signal sequences necessary for secretion or for
intracellular targeting in different host cells and sequences such as FLAG or poly-histidine
sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter
a polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of
the site being changed. In general, the techniques of site-directed mutagenesis are well
known to those of skill in the art and this technique is exemplified by publications such as
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for producing
site-specific changes in a polynucleotide sequence was published by Zoller and Smith,
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino acid
sequence variants of the novel nucleic acids. When small amounts of template DNA are
used as starting material, primer(s) that differ slightly in sequence from the corresponding
region in the template DNA can generate the desired amino acid variant. PCR amplification
results in a population of product DNA fragments that differ from the polynucleotide
template encoding the polypeptide at the position specified by the primer. The product DNA
fragments replace the corresponding region in the plasmid and this gives a polynucleotide 
encoding the desired amino acid variant. 
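
The idea of a mutagenic oligonucleotide or primer, a changed codon flanked on both sides by sequence matching the template, can be sketched as a string operation. This is an illustration only, not the cited authors' protocol: the function name `mutagenic_oligo`, the zero-based `codon_index`, and the default flank of 12 nucleotides are all assumptions, and the template is assumed to begin in frame at its first base.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Return an oligo carrying `new_codon` at `codon_index`, with matching flanks.

    `codon_index` is zero-based and counted from the first base of `template`;
    `flank` is the number of unchanged template bases kept on each side.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the target codon")
    # Upstream flank + replacement codon + downstream flank.
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

With the template `ATGGCTCTGAAAGGT`, replacing the third codon (`CTG`) by `GAT` with a six-base flank yields `ATGGCTGATAAAGGT`; longer flanks would normally be chosen so that the duplex on either side of the change is stable.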

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques
well known in the art, such as, for example, the techniques in Sambrook et al., supra, and 
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of 
the genetic code, other DNA sequences which encode substantially the same or a 
functionally equivalent amino acid sequence may be used in the practice of the invention for
the cloning and expression of these novel nucleic acids. Such DNA sequences include those
which are capable of hybridizing to the appropriate novel nucleic acid sequence under 
stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention could be 
used to generate polynucleotides encoding chimeric or fusion proteins comprising one or 
more domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of 
the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, 
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include, for example,
methods for determining hybridization conditions that can routinely isolate polynucleotides 
of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1041, or 2083-2534,
or functional equivalents thereof, may be used to generate recombinant DNA molecules that
direct the expression of that nucleic acid, or a functional equivalent thereof, in appropriate 
host cells. Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). 
Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, 
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well 
known in the art. Accordingly, the invention also provides a vector including a
polynucleotide of the invention and a host cell containing the polynucleotide. In general, the
vector contains an origin of replication functional in at least one organism, convenient 
restriction endonuclease sites, and a selectable marker for the host cell. Vectors according to 
the invention include expression vectors, replication vectors, probe generation vectors, and 
sequencing vectors. A host cell according to the invention can be a prokaryotic or
eukaryotic cell and can be a unicellular organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic 
acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 or a 
fragment thereof or any other polynucleotides of the invention. In one embodiment, the 
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1041,
or 2083-2534 or a fragment thereof is inserted, in a forward or reverse orientation. In
the case of a vector comprising one of the ORFs of the present invention, the vector may 
further comprise regulatory sequences, including for example, a promoter, operably linked to 
the ORF. Large numbers of suitable vectors and promoters are known to those of skill in the
art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example: Bacterial: pBs, 
phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a 
(Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); Eukaryotic:
pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL
(Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly.
Many suitable expression control sequences are known in the art. General methods of
expressing recombinant proteins are also known and are exemplified in R. Kaufman,
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked" means
that the isolated polynucleotide of the invention and an expression control sequence are
situated within a vector or cell in such a way that the protein is expressed by a host cell
which has been transformed (transfected) with the ligated polynucleotide/expression control
sequence. 

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse
metallothionein-I. Selection of the appropriate vector and promoter is well within the level
of ordinary skill in the art. Generally, recombinant expression vectors will include origins of
replication and selectable markers permitting transformation of the host cell, e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived
from a highly expressed gene to direct transcription of a downstream structural sequence.
Such promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among
others. The heterologous structural sequence is assembled in appropriate phase with
translation initiation and termination sequences, and preferably, a leader sequence capable of
directing secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an amino
terminal identification peptide imparting desired characteristics, e.g., stabilization or
simplified purification of expressed recombinant product. Useful expression vectors for
bacterial use are constructed by inserting a structural DNA sequence encoding a desired 
protein together with suitable translation initiation and termination signals in operable 
reading phase with a functional promoter. The vector will comprise one or more phenotypic
selectable markers and an origin of replication to ensure maintenance of the vector and to, if
desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may
also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from 
commercially available plasmids comprising genetic elements of the well known cloning 
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. Following transformation of a suitable host strain 
and growth of the host strain to an appropriate cell density, the selected promoter is induced 
or derepressed by appropriate means (e.g., temperature shift or chemical induction) and cells
are cultured for an additional period. Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the resulting crude extract retained for further
purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999), incorporated herein by 
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intra-muscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form 
of naked DNA. 

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules 
that are hybridizable to or complementary to the nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO: 1-1041, or 2083-2534, or fragments, analogs or 
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA 
sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a
sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an
entire coding strand, or to only a portion thereof. Nucleic acid molecules encoding
fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO: 1-1041, or
2083-2534 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID
NO: 1-1041, or 2083-2534 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding 
region" of the coding strand of a nucleotide sequence of the invention. The term "coding 
region" refers to the region of the nucleotide sequence comprising codons which are 
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence
of the invention. The term "noncoding region" refers to 5' and 3' sequences that flank the
coding region and are not translated into amino acids (i.e., also referred to as 5' and 3'
untranslated regions).
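
The distinction between the coding region and the flanking 5' and 3' untranslated regions can be illustrated with a short sketch. The function name `split_transcript`, the choice of the first ATG followed by an in-frame stop codon, and the toy sequence are assumptions made only for illustration; they are not part of the disclosure, and the stop codon is grouped with the coding region purely by convention.

```python
# Illustrative sketch only: split a coding-strand cDNA into 5' UTR,
# coding region, and 3' UTR using the first ATG that has an in-frame stop.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_transcript(cdna):
    """Return (five_prime_utr, coding_region, three_prime_utr).

    The coding region runs from the start codon through the stop codon.
    If no open reading frame is found, the whole sequence is returned
    as noncoding and the other two parts are empty strings.
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                end = i + 3
                return seq[:start], seq[start:end], seq[end:]
        # No in-frame stop for this ATG; try the next one.
        start = seq.find("ATG", start + 1)
    return seq, "", ""


# Toy sequence (hypothetical, not taken from the sequence listing).
utr5, cds, utr3 = split_transcript("GGCACCATGGCTTGATTTAAA")
print(utr5, cds, utr3)
```

An antisense molecule directed to the "coding region" would be complementary to the middle element, while one directed to a "noncoding region" would be complementary to either flanking element.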

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g.,
SEQ ID NO: 1-1041, or 2083-2534), antisense nucleic acids of the invention can be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic 
acid molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding 
region of an mRNA. For example, the antisense oligonucleotide can be complementary to
the region surrounding the translation start site of an mRNA. An antisense oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An 
antisense nucleic acid of the invention can be constructed using chemical synthesis or 
enzymatic ligation reactions using procedures known in the art. For example, an antisense 
nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to increase the
biological stability of the molecules or to increase the physical stability of the duplex formed 
between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and 
acridine substituted nucleotides can be used. 
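
As a minimal sketch of the Watson and Crick design rule described above, the following derives an antisense oligonucleotide complementary to the region surrounding the translation start site. The function name `antisense_oligo`, the centering rule, the use of the RNA alphabet for the output, and the toy mRNA are assumptions for illustration only; Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Illustrative sketch only: Watson-Crick antisense design around the first AUG.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna, length=20):
    """Return an antisense oligo (written 5'->3') spanning the first AUG.

    The target window is placed so the start codon sits roughly in its
    middle, then reverse-complemented. DNA input (T) is treated as RNA (U).
    """
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no AUG start codon found")
    # Shift left so that about half of the window lies upstream of the AUG.
    left = max(0, start - (length - 3) // 2)
    target = mrna[left:left + length]
    # Complement each base, then reverse to write the oligo 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


# Toy mRNA (hypothetical, not taken from the sequence listing).
print(antisense_oligo("GGGAAACCCAUGGCUUUUCCCGGG", length=12))
```

The `length` argument corresponds to the oligonucleotide lengths mentioned above (for example, about 15 to 50 nucleotides); the short value used in the call is only to keep the toy example readable.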

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine,
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into which a nucleic acid has been subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of interest, described further in the following
subsection).

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the 
case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific 
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target 
selected cells and then administered systemically. For example, for systemic administration, 
antisense molecules can be modified such that they specifically bind to receptors or antigens 
expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic
acid molecules can also be delivered to cells using the vectors described herein. To achieve 
sufficient intracellular concentrations of antisense molecules, vector constructs in which the 
antisense nucleic acid molecule is placed under the control of a strong pol II or pol III
promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of 
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a 
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having specificity
for a nucleic acid of the invention can be designed based upon the nucleotide sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-1041, or 2083-2534). For example, a derivative
of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the
active site is complementary to the nucleotide sequence to be cleaved in a mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science
261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N. Y. Acad. Sci.
660:27-36; and Maher (1992) BioEssays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate 
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup 
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four natural 
nucleobases are retained. The neutral backbone of PNAs has been shown to allow for 
specific hybridization to DNA and RNA under conditions of low ionic strength. The
synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996) PNAS 93: 
14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation
of gene expression by, e.g., inducing transcription or translation arrest or inhibiting 
replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair 
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes 
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al. (1996), above;
Perry-O'Keefe (1996), above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance 
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by 
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA 
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA 
portion while the PNA portion would provide high binding affinity and specificity. 
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup (1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al. (1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5' DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5:
1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such 
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport 
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered cleavage agents
(See, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport
agent, a hybridization-triggered cleavage agent, etc. 



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain
the polynucleotides of the invention. For example, such host cells may contain nucleic acids
of the invention introduced into the host cell using known transformation, transfection or 
infection methods. The present invention still further provides host cells genetically 
engineered to express the polynucleotides of the invention, wherein such polynucleotides are
in operative association with a regulatory sequence heterologous to the host cell which
drives expression of the polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous promoter
so that the cells express the polypeptide at higher levels. The heterologous promoter is 
inserted in such a manner that it is operatively linked to the encoding sequences. See, for 
example, PCT International Publication No. WO94/12650, PCT International Publication 
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate 
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be 
inserted along with the heterologous promoter DNA. If linked to the coding sequence, 
amplification of the marker DNA by standard selection methods results in co-amplification
of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one
of the polynucleotides of the invention can be used in conventional manners to produce the
gene product encoded by the isolated fragment (in the case of an ORF) or can be used to
produce a heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the 
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, 
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under 
the control of appropriate promoters. Cell-free translation systems can also be employed to 
produce such proteins using RNAs derived from the DNA constructs of the present 
invention. Appropriate cloning and expression vectors for use with prokaryotic and
eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of which is 
hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23: 175 (1981). Other cell lines
capable of expressing a compatible vector are, for example, the C127, monkey COS cells,
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, 
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal 
diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian expression
vectors will comprise an origin of replication, a suitable promoter and also any necessary 
ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional 
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived 
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture are usually
isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous 
ion exchange or size exclusion chromatography steps. Protein refolding steps can be used,
as necessary, in completing configuration of the mature protein. Finally, high performance
liquid chromatography (HPLC) can be employed for final purification steps. Microbial cells 
employed in expression of proteins can be disrupted by any convenient method, including 
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida,
or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or
bacteria, it may be necessary to modify the protein produced therein, for example by 
phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional 
protein. Such covalent attachments may be accomplished using known chemical or 
enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the 
endogenous gene may be replaced by homologous recombination. As described herein, gene 
targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional 
initiation sites, and regulatory protein binding sites or combinations of said sequences. 
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or 
other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple 
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different 
cell-type specificity than the naturally occurring elements. Here, the naturally occurring 
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the 
exogenous DNA has integrated into the host cell genome. The identification of the targeting 
event may also be facilitated by the use of one or more marker genes exhibiting the property 
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker. 
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK) 
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1042-2082,
or 2535-2986 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-1041, or 2083-2534 or the corresponding full length or mature
protein. Polypeptides of the invention also include polypeptides preferably with biological or
immunological activity that are encoded by: (a) a polynucleotide having any one of the
nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534 or (b) polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 1042-2082, or 2535-2986
or (c) polynucleotides that hybridize to the complement of the polynucleotides of either
(a) or (b) under stringent hybridization conditions. The invention also provides biologically
active or immunologically active variants of any of the amino acid sequences set forth as
SEQ ID NO: 1042-2082, or 2535-2986 or the corresponding full length or mature protein;
and "substantial equivalents" thereof (e.g., with at least about 65%, at least about 70%, at
least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least
about 98%, or most typically at least about 99% amino acid identity) that retain biological
activity. Polypeptides encoded by allelic variants may have a similar, increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 1042-2082, or 2535-2986.
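
The amino acid identity thresholds recited above can be made concrete with a small calculation. The sketch below assumes the two sequences have already been aligned (gaps written as "-") and counts identical residues over aligned columns; this passage does not specify the alignment method or the denominator, so both are assumptions, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
# Illustrative sketch only: percent amino acid identity of two pre-aligned
# sequences. Gap characters ("-") never count as matches.
def percent_identity(aligned_a, aligned_b):
    """Return identity as a percentage of aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy peptides (hypothetical): nine of ten positions are identical.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(identity, identity >= 90.0)
```

Under these assumptions, a variant would be compared against a chosen threshold (for example, at least about 90%) by testing the returned percentage; whether such a variant retains biological activity remains an experimental question.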

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein
may be in linear form or they may be cyclized using known methods, for example, as 
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. 
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such as
immunoglobulins for many purposes, including increasing the valency of protein binding
sites. Fragments are also identified in Tables 3, 5, 6, and 8. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein 
coding sequence is identified in the sequence listing by translation of the disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 6. The mature
form of such protein may be obtained and confirmed by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell and sequencing of the cleaved
product. One of skill in the art will recognize that the actual cleavage site may be different
than that predicted in Table 6. The sequence of the mature form of the protein is also
determinable from the amino acid sequence of the full-length form. Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also provided. In 
such forms, part or all of the regions causing the proteins to be membrane bound are deleted 
so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic 
acid fragments of the present invention or by degenerate variants of the nucleic acid 
fragments of the present invention. By "degenerate variant" is intended nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical 
polypeptide sequence. Preferred nucleic acid fragments of the present invention are the 
ORFs that encode proteins. 
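
The definition of a "degenerate variant" can be checked mechanically: two ORFs differ in nucleotide sequence yet translate to the same polypeptide. The sketch below uses the standard genetic code; the helper names `translate` and `is_degenerate_variant` and the toy ORFs are assumptions introduced for illustration and do not come from the sequence listing.

```python
# Illustrative sketch only: test whether two ORFs are degenerate variants,
# i.e. different nucleotide sequences encoding an identical polypeptide.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate an ORF (DNA, coding strand) up to the first stop codon."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_degenerate_variant(orf_a, orf_b):
    """True if the sequences differ but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Toy ORFs (hypothetical): GCT/GCC both encode Ala, AAA/AAG both encode Lys.
print(translate("ATGGCTAAATGA"), is_degenerate_variant("ATGGCTAAATGA", "ATGGCCAAGTGA"))
```

A single substitution that changes the encoded residue (for example, GCT to GGT, Ala to Gly) would make the function return False, since the polypeptide sequences would no longer be identical.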

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino 
acid sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or 
tertiary structural and/or conformational characteristics with proteins may possess biological 
properties in common therewith, including protein activity. This technique is particularly 
useful in producing small peptides and fragments of larger polypeptides. Fragments are 
useful, for example, in generating antibodies against the native polypeptide. Thus, they may 
be employed as biologically active or immunological substitutes for natural, purified 
proteins in screening of therapeutic compounds and in immunological processes for the 
development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified 
from cells which have been altered to express the desired polypeptide or protein. As used 
herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, 
through genetic manipulation, is made to produce a polypeptide or protein which it normally 
does not produce or which the cell normally produces at a lower level. One skilled in the art 
can readily adapt procedures for introducing and expressing either recombinant or synthetic 
sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one 
of the polypeptides or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising 
growing a culture of host cells of the invention in a suitable culture medium, and purifying 
the protein from the cells or the culture in which the cells are grown. For example, the 
methods of the invention include a process for producing a polypeptide in which a host cell 
containing a suitable expression vector that includes a polynucleotide of the invention is 
cultured under conditions that allow expression of the encoded polypeptide. The 
polypeptide can be recovered from the culture, conveniently from the culture medium, or 
from a lysate prepared from the host cells and further purified. Preferred embodiments 
include those in which the protein produced by such process is a full length or mature form 
of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can readily follow
known methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular
Biology. Polypeptide fragments that retain biological/immunological activity include
fragments comprising greater than about 100 amino acids, or greater than about 200 amino
acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well 
known in the art to identify molecules which bind to the polypeptides. These molecules 
include but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds that
are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or other 
cell by the specificity of the binding molecule for SEQ ID NO: 1042-2082, or 2535-2986. 

The protein of the invention may also be expressed as a product of transgenic 
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are
characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein. 

The proteins provided herein also include proteins characterized by amino acid 
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications of
interest in the protein sequences may include the alteration, substitution, replacement, 
insertion or deletion of a selected amino acid residue in the coding sequence. For example, 
one or more of the cysteine residues may be deleted or replaced with another amino acid to
alter the conformation of the molecule. Techniques for such alteration, substitution,
replacement, insertion or deletion are well known to those skilled in the art (see, e.g., U.S. 
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement, insertion or 
deletion retains the desired activity of the protein. Regions of the protein that are important 
5 for the protein function can be determined by various methods known in the art including the 
alanine-scanning method which involved systematic substitution of single or strings of 
amino acids with alanine, followed by testing the resulting alanine-containing variant for 
biological activity. This type of analysis determines the importance of the substituted amino 
acid(s) in biological activity. Regions of the protein that are important for protein function 

10 may be determined by the eMATRIX program. 
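
The alanine-scanning substitution described above can be illustrated with a short sketch. It only enumerates the variant sequences that would then be expressed and tested for biological activity; the function name alanine_scan and the window parameter are hypothetical and are not part of the disclosure.

```python
def alanine_scan(sequence, window=1):
    """Yield (1-based start position, variant) pairs in which `window`
    consecutive residues are replaced by alanine (A)."""
    sequence = sequence.upper()
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        if set(segment) == {"A"}:
            # The segment is already all alanine, so there is nothing to test.
            continue
        variant = sequence[:start] + "A" * window + sequence[start + window:]
        yield start + 1, variant


# Single-residue scan of a short hypothetical peptide.
for position, variant in alanine_scan("MKCA"):
    print(position, variant)
```

A window greater than one corresponds to substituting strings of amino acids; each variant would still have to be assayed to determine the importance of the substituted residues.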

Other fragments and derivatives of the sequences of proteins which would be 
expected to retain protein activity in whole or in part and are useful for screening or other 
immunological methodologies may also be easily made by those skilled in the art given the 
disclosures herein. Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of
the invention to suitable control sequences in one or more insect expression vectors, and
employing an insect expression system. Materials and methods for baculovirus/insect cell
expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art, as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of expressing a
polynucleotide of the present invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or cell
extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such affinity
resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™;
one or more steps involving hydrophobic interaction chromatography using such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with
a His tag. Kits for expression and purification of such fusion proteins are commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently
purified by using a specific antibody directed to such epitope. One such epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or all
of the foregoing purification steps, in various combinations, can also be employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus purified is
substantially free of other mammalian proteins and is defined in accordance with the present
invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein the polypeptide
or analog is fused to another moiety or moieties, e.g., targeting moiety or another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or stability.

Examples of moieties which may be fused to the polypeptide or an analog include, for
example, targeting moieties which provide for the delivery of polypeptide to pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptors and ligands expressed on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such as
cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also, polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta interferon.



4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE 
IDENTITY AND SIMILARITY

Preferred methods for determining identity and/or similarity are designed to give the
largest match between the sequences tested. Methods to determine identity and similarity are
codified in computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif
software (Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322
(1998), herein incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction
algorithm (J. Mol. Biol., 157, pp. 105-31 (1982), incorporated herein by reference).
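
As an illustration of how a percent identity value can be derived from the alignment giving the largest match, the following is a minimal sketch of a global (Needleman-Wunsch style) alignment. It is not the GAP, BLAST or FASTA implementation; the scoring values (match, mismatch, gap) are arbitrary assumptions, and identity is reported over the full alignment length, which is only one of several conventions in use.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Percent of alignment columns holding identical residues (non-empty inputs)."""
    top, bottom = global_align(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(top, bottom))
    return 100.0 * matches / len(top)


print(global_align("ACGT", "AGT"))
print(percent_identity("ACGT", "AGT"))
```

The published programs listed above use substitution matrices, affine gap penalties and statistical significance estimates, none of which are modelled in this sketch.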

Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that separates the
proteins into three sets of locales: intracellular, membrane, or secreted. This prediction is
based upon three characteristics of each polypeptide: the percentage of cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein, and
Kyte-Doolittle scores used to calculate the longest hydrophobic stretch of the protein. Values for
predicted proteins are compared against the values from a set of 592 proteins of known
cellular localization from the Swissprot database (http://www.expasy.ch/sprot/). Predictions
are based upon maximum likelihood estimation.
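
The three characteristics named above can be computed as in the following sketch. SeqLoc itself is proprietary and is not reproduced here: the function names, the use of a mean score for the first 20 residues, and the definition of a hydrophobic stretch as a run of residues with a positive Kyte-Doolittle value are all assumptions made for illustration, and the maximum likelihood comparison against the reference proteins is not shown. The hydropathy values are those of the published Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy values (J. Mol. Biol. 157:105-132, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cysteine_percentage(seq):
    """Percentage of residues that are cysteine."""
    return 100.0 * seq.count("C") / len(seq)


def mean_hydropathy(seq):
    """Average Kyte-Doolittle value over the given residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def longest_hydrophobic_stretch(seq, threshold=0.0):
    """Length of the longest run of residues scoring above `threshold` (assumed definition)."""
    best = run = 0
    for aa in seq:
        run = run + 1 if KYTE_DOOLITTLE[aa] > threshold else 0
        best = max(best, run)
    return best


def localization_features(seq, n_terminal=20):
    """Collect the three features for one polypeptide sequence."""
    seq = seq.upper()
    return {
        "cys_percent": cysteine_percentage(seq),
        "n_term_hydropathy": mean_hydropathy(seq[:n_terminal]),
        "longest_hydrophobic_run": longest_hydrophobic_stretch(seq),
    }


print(localization_features("MLLLLCKK"))
```

A classifier would then compare such feature values with those of proteins of known localization; how SeqLoc performs that comparison is not disclosed beyond the statement that maximum likelihood estimation is used.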

The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul, S., et al.,
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410
(1990)).

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically
active portions of a protein according to the invention. Within the fusion protein, the term
"operatively linked" is intended to indicate that the polypeptide according to the invention
and the other polypeptide are fused in-frame to each other. The polypeptide can be fused to
the N-terminus or C-terminus, or to the middle.

For example, in one embodiment a fusion protein comprises a polypeptide according
to the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the 
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., 
glutathione S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in 
which the polypeptide sequences according to the invention comprise one or more domains 
fused to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in 
vivo. The immunoglobulin fusion proteins can be used to affect the bioavailability of a 
cognate ligand. Inhibition of the ligand/protein interaction may be useful therapeutically for 
both the treatment of proliferative and differentiative disorders, e.g., cancer as well as 
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin 
fusion proteins of the invention can be used as immunogens to produce antibodies in a 
subject, to purify ligands, and in screening assays to identify molecules that inhibit the 
interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard 
recombinant DNA techniques. For example, DNA fragments coding for the different 
polypeptide sequences are ligated together in-frame in accordance with conventional 
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction 
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as 
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic 
ligation. In another embodiment, the fusion gene can be synthesized by conventional 
techniques including automated DNA synthesizers. Alternatively, PCR amplification of 
gene fragments can be carried out using anchor primers that give rise to complementary 
overhangs between two consecutive gene fragments that can subsequently be annealed and 
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et al. (eds.) 
Current Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the invention can be
cloned into such an expression vector such that the fusion moiety is linked in-frame to the
protein of the invention. 
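
The requirement that the two coding sequences be joined in-frame can be checked computationally before any cloning is attempted. The sketch below is illustrative only: the function fuse_in_frame, its optional linker argument and the example sequences are hypothetical, and it assumes each input is a coding sequence that begins in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(n_terminal_cds, c_terminal_cds, linker=""):
    """Join two coding sequences so that the second is read in the same frame."""
    first = n_terminal_cds.upper()
    second = c_terminal_cds.upper()
    linker = linker.upper()
    # Remove the stop codon of the N-terminal partner so translation continues.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    for name, part in (("N-terminal partner", first),
                       ("linker", linker),
                       ("C-terminal partner", second)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3; "
                             "the fusion would shift the reading frame")
    fusion = first + linker + second
    # Every codon except the last must be a sense codon.
    if any(codon in STOP_CODONS for codon in codons(fusion)[:-1]):
        raise ValueError("premature stop codon inside the fusion")
    return fusion


# Hypothetical fusion-moiety and insert sequences.
print(fuse_in_frame("ATGAAATAA", "ATGCCCTGA"))
```

In practice the junction is usually defined by the restriction sites or primers of the chosen expression vector, so the same check would be applied to the sequence actually produced by the cloning strategy.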

4.8 GENE THERAPY 
Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention; or to treat disease states involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For
additional reviews of gene therapy technology see Friedmann, Science, 244: 1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992).

Introduction of any one of the nucleotides of the present invention or a gene encoding the
polypeptides of the present invention can also be accomplished with extrachromosomal
substrates (transient expression) or artificial chromosomes (stable expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention in order to
proliferate or to produce a desired effect on or activity in such cells. Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is contemplated that in other
human disease states, preventing the expression of or inhibiting the activity of polypeptides
of the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the present
invention can be inhibited by using targeted deletion methods, or the insertion of a negative
regulatory element such as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression of
the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the protein at higher levels. The heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT International
Publication No. WO 92/20808, and PCT International Publication No. WO 91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with
the heterologous promoter DNA. If linked to the desired protein coding sequence,
amplification of the marker DNA by standard selection methods results in co-amplification of
the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control
of inducible regulatory elements, in which case the regulatory sequences of the endogenous
gene may be replaced by homologous recombination. As described herein, gene targeting can
be used to replace a gene's existing regulatory region with a regulatory sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment
regions, negative regulatory elements, transcriptional initiation sites, regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA stability
elements, splice sites, leader sequences for enhancing or modifying transport or secretion
properties of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type
specificity than the naturally occurring elements. Here, the naturally occurring sequences are
deleted and new sequences are added. In all cases, the identification of the targeting event may
be facilitated by the use of one or more selectable marker genes that are contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated
into the cell genome. The identification of the targeting event may also be facilitated by the use
of one or more marker genes exhibiting the property of negative selection, such that the
negatively selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference.
Transgenic animals are useful to determine the roles polypeptides of the invention play in
biological processes, and preferably in disease states. Transgenic animals are useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic animals,
preferably non-human mammals, are produced using methods as described in U.S. Patent No.
5,489,743 and PCT Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased protein
expression. The homologous promoter can be supplemented by insertion of one or more
heterologous enhancer elements known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knock out strategies, of animals that fail to
express polypeptides of the invention or that express a variant polypeptide. Such animals are
useful as models for studying the in vivo activities of polypeptide as well as for studying
modulators of the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one
or more of the uses or biological activities (including those associated with assays cited
herein) identified herein. Uses or activities described for proteins of the present invention
may be provided by administration or use of such proteins or of polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for introduction of
DNA). The mechanism underlying the particular condition or pathology will dictate whether
the polypeptides of the invention, the polynucleotides of the invention or modulators
(activators or inhibitors) thereof would be beneficial to the subject in need of treatment.
Thus, "therapeutic compositions of the invention" include compositions comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and degenerate
variants thereof) or polypeptides of the invention (including full length protein, mature
protein and truncations or domains thereof), or compounds and other substances that
modulate the overall activity of the target gene products, either at the level of target
gene/protein expression or target protein activity. Such modulators include polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and other binding
proteins; chemical compounds that directly or indirectly activate or inhibit the polypeptides
of the invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of the
polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage
of tissue differentiation or development or in disease states); as molecular weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map
related gene positions; to compare with endogenous DNA sequences in patients to identify
potential genetic disorders; as probes to hybridize and thus discover novel, related DNA
sequences; as a source of information to derive PCR primers for genetic fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other novel
polynucleotides; for selecting and making oligomers for attachment to a "gene chip" or other
support, including for examination of expression patterns; to raise anti-protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the binding interaction.
The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of the protein (or
its receptor) in biological fluids; as markers for tissues in which the corresponding
polypeptide is preferentially expressed (either constitutively or at a particular stage of tissue
differentiation or development or in a disease state); and, of course, to isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also be used to
screen for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent 
grade or kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the
art. References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be added to
the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be added to the
medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or 
inhibiting) activity or may induce production of other cytokines in certain cell populations. 
A polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Many protein factors discovered to date, including all known cytokines, have exhibited 
activity in one or more factor-dependent cell proliferation assays, and hence the assays serve 
as a convenient confirmation of cytokine activity. The activity of therapeutic compositions 
of the present invention is evidenced by any one of a number of routine factor dependent cell 
proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, 
B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DAI, 123, T1165, HT2, CTLL2, TF-1, 
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can be used in 
the following: 

Assays for T-cell or thymocyte proliferation include without limitation those 
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. 
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; 
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 
133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J.
Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells 
or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E.
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938,
1983; Measurement of mouse and human interleukin 6 - Nordan, R. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human
Interleukin 11 - Bennett, R., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons, Toronto. 1991;
Measurement of mouse and human Interleukin 9 - Ciarletta, A., Giannotti, J., Clark, S. C.
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1,
John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies,
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc.
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411,
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity 
and be involved in the proliferation, differentiation, and survival of pluripotent and totipotent 
stem cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells 
and/or germ line stem cells. Administration of the polypeptide of the invention to stem cells 
in vivo or ex vivo is expected to maintain and expand cell populations in a totipotential or 
pluripotential state, which would be useful for re-engineering damaged or diseased tissues, 
transplantation, manufacture of bio-pharmaceuticals and the development of bio-sensors. 
The ability to produce large quantities of human cells has important working applications for 
the production of human proteins which currently must be obtained from non-human sources 
or donors; implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other 
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage, 
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells, 
gastrointestinal cells and others; and organs for transplantation such as kidney, liver, 
pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines 
may be administered in combination with the polypeptide of the invention to achieve the 
desired effect, including any of the growth factors listed herein, other stem cell maintenance 
factors, and specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), 
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, 
macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, 
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor (PDGF), 
neural growth factors and basic fibroblast growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion 
of these cells in culture will facilitate the production of large quantities of mature cells. 
Techniques for culturing stem cells are known in the art and administration of polypeptides 
of the invention, optionally with other growth factors and/or cytokines, is expected to 
enhance the survival and proliferation of the stem cell populations. This can be 
accomplished by direct administration of the polypeptide of the invention to the culture 
medium. Alternatively, stromal cells transfected with a polynucleotide that encodes the 
polypeptide of the invention can be used as a feeder layer for the stem cell populations in 
culture or in vivo. Stromal support cells for feeder layers may include embryonic bone 
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured embryonic 
fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to 
induce autocrine expression of the polypeptide of the invention. This will allow for 
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as is 
or that can then be differentiated into the desired mature cell types. These stable cell lines 
can also serve as a source of undifferentiated totipotential/pluripotential mRNA to create 
cDNA libraries and templates for polymerase chain reaction experiments. These studies 
would allow for the isolation and identification of differentially expressed genes in stem cell 
populations that regulate stem cell proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present 
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial cells 
that can be used to augment or replace cells damaged by illness, autoimmune disease, 
accidental damage or genetic disorders. The polypeptide of the invention may be useful for 
inducing the proliferation of neural cells and for the regeneration of nerve and brain tissue, 
i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as 
well as mechanical and traumatic disorders which involve degeneration, death or trauma to 
neural cells or nerve tissue. In addition, the expanded stem cell populations can also be 
genetically altered for gene therapy purposes and to decrease host rejection of replacement 
tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated 
cell types. A broadly applicable method of obtaining pure populations of a specific 
differentiated cell type from undifferentiated stem cell populations involves the use of a 
cell-type specific promoter driving a selectable marker. The selectable marker allows only cells 
of the desired type to survive. For example, stem cells can be induced to differentiate into 
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. 
Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In: Principles of 
Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively, directed 
differentiation of stem cells can be accomplished by culturing the stem cells in the presence 
of a differentiation factor such as retinoic acid and an antagonist of the polypeptide of the 
invention which would inhibit the effects of endogenous stem cell factor activity and allow 
differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the 
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one of 
various cell sources (including hematopoietic stem cells and embryonic stem cells) and 
cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in 
combination with other growth factors or cytokines. The ability of the polypeptide of the 
invention to induce stem cell proliferation is determined by colony formation on semi-solid 
support, e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991). 

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of 
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders. 
Even marginal biological activity in support of colony forming cells or of factor-dependent 
cell lines indicates involvement in regulating hematopoiesis, e.g. in supporting the growth 
and proliferation of erythroid progenitor cells alone or in combination with other cytokines, 
thereby indicating utility, for example, in treating various anemias or for use in conjunction 
with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or 
erythroid cells; in supporting the growth and proliferation of myeloid cells such as 
granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example, 
in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in 
supporting the growth and proliferation of megakaryocytes and consequently of platelets 
thereby allowing prevention or treatment of various platelet disorders such as 
thrombocytopenia, and generally for use in place of or complementary to platelet 
transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells 
which are capable of maturing to any and all of the above-mentioned hematopoietic cells and 
therefore find therapeutic utility in various stem cell disorders (such as those usually treated 
with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal 
hemoglobinuria), as well as in repopulating the stem cell compartment post 
irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow 
transplantation or with peripheral progenitor cell transplantation (homologous or 
heterologous)) as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 
Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., 
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 
1993. 

Assays for stem cell survival and differentiation (which will identify, among others, 
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. 
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; 
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic 
colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In 
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., 
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994; 
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells. 
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term 
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen, 
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, 
Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In 
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., 
New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, 
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing and 
tissue repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved 
fixation of artificial joints. De novo bone formation induced by an osteogenic agent 
contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming 
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by 
blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast 
activity, etc.) mediated by inflammatory processes may also be possible using the 
composition of the invention. 

Another category of tissue regeneration activity that may involve the polypeptide of 
the present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue 
or other tissue formation in circumstances where such tissue is not normally formed has 
application in the healing of tendon or ligament tears, deformities and other tendon or 
ligament defects in humans and other animals. Such a preparation employing a 
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing 
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or 
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De 
novo tendon/ligament-like tissue formation induced by a composition of the present 
invention contributes to the repair of congenital, trauma induced, or other tendon or ligament 
defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair 
of tendons or ligaments. The compositions of the present invention may provide an 
environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or 
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to 
effect tissue repair. The compositions of the invention may also be useful in the treatment of 
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions 
may also include an appropriate matrix and/or sequestering agent as a carrier as is well 
known in the art. 

The compositions of the present invention may also be useful for proliferation of 
neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and 
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve 
tissue. More specifically, a composition may be used in the treatment of diseases of the 
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and 
localized neuropathies, and central nervous system diseases, such as Alzheimer's, 
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager 
syndrome. Further conditions which may be treated in accordance with the present invention 
include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and 
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from 
chemotherapy or other medical therapies may also be treatable using a composition of the 
invention. 

Compositions of the invention may also be useful to promote better or faster closure 
of non-healing wounds, including without limitation pressure ulcers, ulcers associated with 
vascular insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, 
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular 
(including vascular endothelium) tissue, or for promoting the growth of cells comprising 
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic 
scarring to allow normal tissue to regenerate. A polypeptide of the present invention may 
also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 
Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International 
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: 
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), 
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. 
Dermatol. 71:382-84 (1978). 

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or 
immune suppressing activity, including without limitation the activities for which assays are 
described herein. A polynucleotide of the invention can encode a polypeptide exhibiting 
such activities. A protein may be useful in the treatment of various immune deficiencies and 
disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or 
down) growth and proliferation of T and/or B lymphocytes, as well as effecting the cytolytic 
activity of NK cells and other cell populations. These immune deficiencies may be genetic or 
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from 
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, 
fungal or other infection may be treatable using a protein of the present invention, including 
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania spp., malaria 
spp. and various fungal infections such as candidiasis. Of course, in this regard, proteins of 
the present invention may also be useful where a boost to the immune system generally may 
be desirable, i.e., in the treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus 
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre 
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, 
graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein (or 
antagonists thereof, including antibodies) of the present invention may also be useful in 
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum sickness, drug 
reactions, food allergies, insect venom allergies, mastocytosis, allergic rhinitis, 
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic 
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic conjunctivitis, 
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary conjunctivitis and 
contact allergies), such as asthma (particularly allergic asthma) or other respiratory 
problems. Other conditions, in which immune suppression is desired (including, for 
example, organ transplantation), may also be treatable using a protein (or antagonists 
thereof) of the present invention. The therapeutic effects of the polypeptides or antagonists 
thereof on allergic reactions can be evaluated by in vivo animal models such as the 
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 1998), skin 
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization test 
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., 
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or 
blocking an immune response already in progress or may involve preventing the induction of 
an immune response. The functions of activated T cells may be inhibited by suppressing T 
cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T 
cell responses is generally an active, non-antigen-specific process which requires continuous 
exposure of the T cells to the suppressive agent. Tolerance, which involves inducing 
non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that 
it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. 
Operationally, tolerance can be demonstrated by the lack of a T cell response upon 
reexposure to specific antigen in the absence of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin 
and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage 
of T cell function should result in reduced tissue destruction in tissue transplantation. 
Typically, in tissue transplants, rejection of the transplant is initiated through its recognition 
as foreign by T cells, followed by an immune reaction that destroys the transplant. The 
administration of a therapeutic composition of the invention may prevent cytokine synthesis 
by immune cells, such as T cells, and thus acts as an immunosuppressant. Moreover, a lack 
of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in 
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may 
avoid the necessity of repeated administration of these blocking reagents. To achieve 
sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the 
function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac 
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been 
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as 
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. 
Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed., 
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to 
determine the effect of therapeutic compositions of the invention on the development of that 
disease. 

Blocking antigen function may also be therapeutically useful for treating 
autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation 
of T cells that are reactive against self-tissue and which promote the production of cytokines 
and autoantibodies involved in the pathology of the diseases. Preventing the activation of 
autoreactive T cells may reduce or eliminate disease symptoms. Administration of reagents 
which block stimulation of T cells can be used to inhibit T cell activation and prevent 
production of autoantibodies or T cell-derived cytokines which may be involved in the 
disease process. Additionally, blocking reagents may induce antigen-specific tolerance of 
autoreactive T cells which could lead to long-term relief from the disease. The efficacy of 
blocking reagents in preventing or alleviating autoimmune disorders can be determined 
using a number of well-characterized animal models of human autoimmune diseases. 
Examples include murine experimental autoimmune encephalitis, systemic lupus 
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen 
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia 
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a 
means of up regulating immune responses, may also be useful in therapy. Upregulation of 
immune responses may be in the form of enhancing an existing immune response or eliciting 
an initial immune response. For example, enhancing an immune response may be useful in 
cases of viral infection, including systemic viral diseases such as influenza, the common 
cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory 
form of a soluble peptide of the present invention and reintroducing the in vitro activated T 
cells into the patient. Another method of enhancing anti-viral immune responses would be to 
isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of 
the present invention as described herein such that the cells express all or a portion of the 
protein on their surface, and reintroduce the transfected cells into the patient. The infected 
cells would now be capable of delivering a costimulatory signal to, and thereby activate, T 
cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal 
to T cells to induce a T cell mediated immune response against the transfected tumor cells. 
In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected 
with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) 
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II 
alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I 
or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II 
MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g., 
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor 
cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC 
class II associated protein, such as the invariant chain, can also be cotransfected with a DNA 
encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of 
tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T 
cell mediated immune response in a human subject may be sufficient to overcome 
tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, 
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. 
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., 
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et 
al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 
1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching 
(which will identify, among others, proteins that modulate T-cell dependent antibody 
responses and that affect Th1/Th2 profiles) include, without limitation, those described in: 
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro 
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. 
E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, 
proteins that generate predominantly Th1 and CTL responses) include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, 
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; 
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 
1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins 
expressed by dendritic cells that activate naive T-cells) include, without limitation, those 
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of 
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology 
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., 
Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, 
proteins that prevent apoptosis after superantigen induction and proteins that regulate 
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz et 
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et 
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, 
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; 
Gorczyca et al., International Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine 
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; 
Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991. 

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate 
the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present 
invention, alone or in heterodimers with a member of the inhibin family, may be useful as a 
contraceptive based on the ability of inhibins to decrease fertility in female mammals and 
decrease spermatogenesis in male mammals. Administration of sufficient amounts of other 
inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the 
invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin 
group, may be useful as a fertility inducing therapeutic, based upon the ability of activin 
molecules to stimulate FSH release from cells of the anterior pituitary. See, for example, 
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for advancement 
of the onset of fertility in sexually immature mammals, so as to increase the lifetime 
reproductive performance of domestic animals such as, but not limited to, cows, sheep and 
pigs. 

The activity of a polypeptide of the invention may, among other means, be measured 
by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc.
Natl. Acad. Sci. USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes, fibroblasts,
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Chemotactic and chemokinetic receptor activation can be used to mobilize or attract a
desired cell population to a desired site of action. Chemotactic or chemokinetic compositions
(e.g. proteins, antibodies, binding partners, or modulators of the invention) provide particular 
advantages in treatment of wounds and other trauma to tissues, as well as in treatment of 
localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to
tumors or sites of infection may result in improved immune responses against the tumor or
infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell 
population. Preferably, the protein or peptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a population of 
cells can be readily determined by employing such protein or peptide in any known assay for 
cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 
Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E.
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta
Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al.
APMIS 103:140-146, 1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of
Immunol. 152:5860-5867, 1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders
(including hereditary disorders, such as hemophilias) or to enhance coagulation and other
hemostatic events in treating wounds resulting from trauma, surgery or other causes. A
composition of the invention may also be useful for dissolving or inhibiting formation of
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for
example, infarction of cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub,
Prostaglandins 35:467-474, 1988.



4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation
or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer.
For example, the presence or increased expression of a polynucleotide/polypeptide of the
invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing
malignancy. Conversely, a defect in the gene or absence of the polypeptide may be
associated with a cancer condition. Identification of single nucleotide polymorphisms
associated with cancer or a predisposition to cancer may also be useful for diagnosis or
prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor
growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness.
Therapeutic compositions of the invention may be effective in adult and pediatric oncology
including in solid phase tumors/malignancies, locally advanced tumors, human soft tissue
sarcomas, metastatic cancer, including lymphatic metastases, blood cell malignancies
including multiple myeloma, acute and chronic leukemias, and lymphomas, head and neck
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers including
small cell carcinoma and non-small cell cancers, breast cancers including small cell
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal cancer,
stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal
neoplasia, pancreatic cancers, liver cancer, urologic cancers including bladder cancer and
prostate cancer, malignancies of the female genital tract including ovarian carcinoma, uterine
(including endometrial) cancers, and solid tumor in the ovarian follicle, kidney cancers
including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant
melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can be
administered in therapeutically effective dosages alone or in combination with adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser
therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of tumor
growth, inhibiting metastasis, or otherwise improving overall clinical condition, without
necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a
pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer
treatment is routine. Anti-cancer drugs that are well known in the art and can be used as a
treatment in combination with the polypeptide or modulator of the invention include:
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin,
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl,
Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine, 5-Fluorouracil (5-Fu),
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX),
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin,
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to developing
cancers. Under these circumstances, it may be beneficial to treat these individuals with
therapeutically effective doses of the polypeptide of the invention to reduce the risk of
developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays
of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987)
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst.,
52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays
as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis
assays such as induction of vascularization of the chick chorioallantoic membrane or
induction of vascular endothelial cell migration as described in Ribatti et al., Intl. J. Dev.
Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively.
Suitable tumor cell lines are available, e.g. from American Type Tissue Culture Collection
catalogs.


4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of
the invention can encode a polypeptide exhibiting such characteristics. Examples of such
receptors and ligands include, without limitation, cytokine receptors and their ligands,
receptor kinases and their ligands, receptor phosphatases and their ligands, receptors
involved in cell-cell interactions and their ligands (including without limitation, cellular
adhesion molecules (such as selectins, integrins and their ligands) and receptor/ligand pairs
involved in antigen presentation, antigen recognition and development of cellular and
humoral immune responses). Receptors and ligands are also useful for screening of potential
peptide or small molecule inhibitors of the relevant receptor/ligand interaction. A protein of
the present invention (including, without limitation, fragments of receptors and ligands) may
itself be useful as an inhibitor of receptor/ligand interactions.

The activity of a polypeptide of the invention may, among other means, be measured
by the following methods:

Suitable assays for receptor-ligand activity include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions
7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al.,
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989;
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be 
identified through binding assays, affinity chromatography, dihybrid screening assays, 
BIAcore assays, gel overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides
of the present invention or ligand(s) thereof may be labeled by being coupled to
radioisotopes, colorimetric molecules or toxin molecules by conventional methods
("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182
(1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but are not
limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other colorimetric
molecules. Examples of toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening 
techniques. The polypeptides or fragments employed in such a test may either be free in 
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One 
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably 
transformed with recombinant nucleic acids expressing the polypeptide or a fragment 
thereof. Drugs are screened against such transformed cells in competitive binding assays.
Such cells, either in viable or fixed form, can be used for standard binding assays. One may 
measure, for example, the formation of complexes between polypeptides of the invention or 
fragments and the agent being tested or examine the diminution in complex formation 
between the novel polypeptides and an appropriate cell line, which are well known in the art. 
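
As a minimal sketch of how a competitive binding readout of this kind might be quantified, the following Python function computes the percent reduction in specific complex formation caused by a test agent. The function name, the background-subtraction scheme, and the signal values are illustrative assumptions and are not taken from this specification.

```python
def percent_inhibition(bound_without_agent, bound_with_agent, nonspecific=0.0):
    """Percent decrease in specific complex formation caused by a test agent.

    All arguments are assay signals in the same arbitrary units
    (hypothetical readout; any consistent unit works).
    """
    specific_control = bound_without_agent - nonspecific
    specific_test = bound_with_agent - nonspecific
    if specific_control <= 0:
        raise ValueError("control signal must exceed nonspecific background")
    return 100.0 * (1.0 - specific_test / specific_control)


# Made-up example values, for illustration only.
inhibition = percent_inhibition(1000.0, 550.0, nonspecific=100.0)
```

A value near 0 would suggest that the agent does not compete for binding, while a value near 100 would suggest nearly complete displacement under the assumed assay conditions.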

Sources for test compounds that may be screened for ability to bind to or modulate 
(i.e., increase or decrease) the activity of polypeptides of the invention include (1) inorganic 
and organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or
marine microorganisms or (2) extraction of the organisms themselves. Natural product
libraries include polyketides, non-ribosomal peptides, and (non-naturally occurring) variants
thereof. For a review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides
or organic compounds and can be readily prepared by traditional automated synthesis
methods, PCR, cloning or proprietary synthetic methods. Of particular interest are peptide
and oligonucleotide combinatorial libraries. Still other libraries of interest include peptide,
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide
libraries. For a review of combinatorial chemistry and libraries created therefrom, see
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of
peptidomimetic libraries, see Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby
et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med Chem,
4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein 
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" 
to bind a polypeptide of the invention. The molecules identified in the binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal models that are
well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or 
animals and then tested for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of
the binding molecule for a polypeptide of the invention. Alternatively, the binding 
molecules may be complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For
example, expression cloning using mammalian or bacterial cells, or dihybrid screening
assays can be used to identify polynucleotides encoding binding partners. As another
example, affinity chromatography with the appropriate immobilized polypeptide of the
invention can be used to isolate polypeptides that recognize and bind polypeptides of the
invention. There are a number of different libraries used for the identification of
compounds, and in particular small molecules, that modulate (i.e., increase or decrease)
biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the
invention can also be identified by adding exogenous ligands, or cocktails of ligands, to two
cell populations that are genetically identical except for the expression of the receptor of the
invention: one cell population expresses the receptor of the invention whereas the other does
not. The responses of the two cell populations to the addition of ligand(s) are then
compared. Alternatively, an expression library can be co-expressed with the polypeptide of
the invention in cells and assayed for an autocrine response to identify potential ligand(s). As
still another example, BIAcore assays, gel overlay assays, or other methods known in the art
can be used to identify binding partner polypeptides, including, (1) organic and inorganic
chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of
random peptides, oligonucleotides or organic molecules.
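
The comparison of two otherwise identical cell populations described above might be sketched as follows. This is only an illustration under stated assumptions: the function name, the fold-change criterion, the threshold of 2.0, and the response values are all hypothetical and do not come from this specification.

```python
def candidate_ligands(responses_expressing, responses_control, min_fold=2.0):
    """Return ligands whose response is enriched in receptor-expressing cells.

    Both arguments map a ligand name to a measured response (arbitrary,
    consistent units). A ligand is flagged when the response in the
    receptor-expressing population is at least `min_fold` times the
    response in the population lacking the receptor.
    """
    hits = []
    for ligand, with_receptor in responses_expressing.items():
        without_receptor = responses_control[ligand]
        if without_receptor > 0 and with_receptor / without_receptor >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Made-up responses, for illustration only.
expressing = {"ligand_A": 8.0, "ligand_B": 1.1, "ligand_C": 3.0}
control = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 2.5}
hits = candidate_ligands(expressing, control)
```

In practice the choice of response measure and threshold would depend on the assay; the sketch only shows the receptor-dependent comparison itself.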

The role of downstream intracellular signaling molecules in the signaling cascade of
the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, e.g.,
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity.
The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in
the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for
example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or
suppressing production of other factors which more directly inhibit or promote an
inflammatory response. Compositions with such activities can be used to treat inflammatory
conditions (including chronic or acute conditions), including without limitation inflammation
associated with infection (such as septic shock, sepsis or systemic inflammatory response
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis,
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung
injury, inflammatory bowel disease, Crohn's disease or resulting from over production of
cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat
anaphylaxis and hypersensitivity to an antigenic substance or material. Compositions of this
invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis,
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus
host disease, inflammatory bowel disease, inflammation associated with pulmonary disease,
other autoimmune disease or inflammatory disease, or as an antiproliferative agent, such as for
acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of
the invention. Such leukemias and related disorders include but are not limited to acute
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic,
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).


4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient
(including human and non-human mammalian patients) according to the invention include
but are not limited to the following lesions of either the central (including spinal cord, brain)
or peripheral nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated
with surgery, for example, lesions which sever a portion of the nervous system, or
compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease, tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of
the nervous system is destroyed or injured by a nutritional disorder or disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various
etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival
or differentiation of neurons. For example, and not by way of limitation, therapeutics which
elicit any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor
neuron dysfunction may be measured by assessing the physical manifestation of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to
toxin, trauma, surgical damage, degenerative disease or malignancy that may affect motor
neurons as well as other components of the nervous system, as well as disorders that
selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited
to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis,
infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood
(Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary Motorsensory
Neuropathy (Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites;
affecting (suppressing or enhancing) bodily characteristics, including, without limitation,
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or
organ or body part size or shape (such as, for example, breast augmentation or diminution,
change in bone form or shape); affecting biorhythms or circadian cycles or rhythms;
affecting the fertility of male or female subjects; affecting the metabolism, catabolism,
anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein,
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or component(s);
affecting behavioral characteristics, including, without limitation, appetite, libido, stress,
cognition (including cognitive disorders), depression (including depressive disorders) and
violent behaviors; providing analgesic effects or other pain reducing effects; promoting
differentiation and growth of embryonic stem cells in lineages other than hematopoietic
lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of
the enzyme and treating deficiency-related diseases; treatment of hyperproliferative
disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for
example, the ability to bind antigens or complement); and the ability to act as an antigen in a
vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.


4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential
predisposition or susceptibility to various disease states (such as disorders involving
inflammation or immune response) or a differential response to drug administration, and this
genetic information can be used to tailor preventive or therapeutic treatment appropriately.
For example, the existence of a polymorphism associated with a predisposition to
inflammation or autoimmune disease makes possible the diagnosis of this condition in
humans by identifying the presence of the polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, 
optionally involving isolation or amplification of the DNA, and identifying the presence of 
the polymorphism in the DNA. For example, PCR may be used to amplify an appropriate
fragment of genomic DNA which may then be sequenced. Alternatively, the DNA may be
subjected to allele-specific oligonucleotide hybridization (in which appropriate
oligonucleotides are hybridized to the DNA under conditions permitting detection of a single
base mismatch) or to a single nucleotide extension assay (in which an oligonucleotide that
hybridizes immediately adjacent to the position of the polymorphism is extended with one or
more labeled nucleotides). In addition, traditional restriction fragment length polymorphism
analysis (using restriction enzymes that provide differential digestion of the genomic DNA
depending on the presence or absence of the polymorphism) may be performed. Arrays with
nucleotide sequences of the present invention can be used to detect polymorphisms. The
array can comprise modified nucleotide sequences of the present invention in order to detect 
the nucleotide sequences of the present invention. In the alternative, any one of the 
nucleotide sequences of the present invention can be placed on the array to detect changes 
from those sequences. 
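
The restriction fragment length polymorphism analysis mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two short sequences are made up for the example and are not sequences of the invention, the helper name digest is hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is chosen simply as a familiar enzyme.

```python
# Illustrative sketch only: a polymorphism that destroys (or creates) a
# restriction site changes the fragment lengths obtained after digestion.

def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site.
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT = 1          # EcoRI cuts between G and AATTC

# Hypothetical sequences invented for this example; the variant differs by
# one base (A to G) inside the recognition site.
reference = "ATGCCGAATTCTTAGGCATCG"
variant = "ATGCCGAGTTCTTAGGCATCG"

ref_fragments = digest(reference, ECORI_SITE, ECORI_CUT)
var_fragments = digest(variant, ECORI_SITE, ECORI_CUT)

# Differing fragment patterns indicate the presence of the polymorphism.
polymorphism_detected = ref_fragments != var_fragments
```

In this toy case the reference sequence is cut into two fragments while the variant remains uncut, which is the differential digestion the analysis relies on.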

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein,
e.g., by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the protocol is described
by J. Holoshitz et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, Int. Arch.
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a single
injection, generally intradermally, of a suspension of killed Mycobacterium tuberculosis in
complete Freund's adjuvant (CFA). The route of injection can vary, but rats may be injected 
at the base of the tail with an adjuvant mixture. The polypeptide is administered in phosphate 
buffered solution (PBS) at a dose of about 1-5 mg/kg. The control consists of administering 
PBS only. 

The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by immediately
administering the test compound and subsequent treatment every other day until day 24. At
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an overall arthritis
score may be obtained as described by J. Holoshitz above. An analysis of the data would
reveal that the test compound would have a dramatic effect on the swelling of the joints as
measured by a decrease of the arthritis score.
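
The dosing and scoring schedule described above can be laid out as a short Python sketch. It assumes that the day of adjuvant injection is day 0 and that the first dose is given on that day; the function names are hypothetical, and no scores are supplied because none are given in the text.

```python
# Sketch of the schedule described above (assumption: injection day = day 0).

def treatment_days(last_day: int = 24, interval: int = 2) -> list[int]:
    """Days on which the test compound is given: day 0, then every other day."""
    return list(range(0, last_day + 1, interval))

# Days after injection on which an overall arthritis score is obtained.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]

def score_reduction(control: dict[int, float], treated: dict[int, float]) -> dict[int, float]:
    """Per-day decrease in arthritis score (PBS control minus treated group)."""
    return {day: control[day] - treated[day] for day in SCORING_DAYS}
```

A positive value returned by score_reduction for a given day corresponds to the decrease in arthritis score by which the effect of the test compound is measured.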

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies
or other binding partners or modulators including antisense polynucleotides) of the invention 
have numerous applications in a variety of therapeutic methods. Examples of therapeutic 
applications include, but are not limited to, those exemplified herein. 


4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode
of administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age,
weight, condition and response of the individual patient. Typically, the amount of
polypeptide administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body
weight. For parenteral administration, polypeptides of the invention will be formulated in an
injectable form combined with a pharmaceutically acceptable parenteral vehicle. Such
vehicles are well known in the art and examples include water, saline, Ringer's solution,
dextrose solution, and solutions containing small amounts of human serum albumin.
The vehicle may contain minor amounts of additives that maintain the isotonicity and 
stability of the polypeptide or other active ingredient. The preparation of such solutions is 
within the skill of the art. 
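
The weight-based dose ranges stated above translate into absolute amounts per dose by simple multiplication. The Python sketch below shows that arithmetic only; the function name and the example body weight are assumptions for illustration, and the actual dosage remains a matter for the prescribing physician.

```python
# Dose ranges quoted in the text, expressed in micrograms per kg body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100_000.0)    # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10_000.0)  # about 0.1 ug/kg to 10 mg/kg

def dose_range_mg(weight_kg: float, range_ug_per_kg: tuple[float, float]) -> tuple[float, float]:
    """Convert a per-kg range (ug/kg) into an absolute per-dose range in mg."""
    low, high = range_ug_per_kg
    return (low * weight_kg / 1000.0, high * weight_kg / 1000.0)

# Hypothetical 70 kg patient, chosen only to make the arithmetic concrete.
preferred_low_mg, preferred_high_mg = dose_range_mg(70.0, PREFERRED_RANGE_UG_PER_KG)
```

For the hypothetical 70 kg patient, the preferred range works out to roughly 0.007 mg to 700 mg of polypeptide per dose.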

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION

A protein or other composition of the present invention (from whatever source 
derived, including without limitation from recombinant and non-recombinant sources and 
including antibodies and other binding partners of the polypeptides of the invention) may be 
administered to a patient in need, by itself, or in pharmaceutical compositions where it is
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of 
disorders. Such a composition may optionally contain (in addition to protein or other active 
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and other
materials well known in the art. The term "pharmaceutically acceptable" means a non-toxic
material that does not interfere with the effectiveness of the biological activity of the active 
ingredient(s). The characteristics of the carrier will depend on the route of administration. 
The pharmaceutical composition of the invention may also contain cytokines, lymphokines, 
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5,
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2,
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In further 
compositions, proteins of the invention may be combined with other agents beneficial to the 
treatment of the disease or disorder in question. These agents include various growth factors 
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well as cytokines
described herein. 

The pharmaceutical composition may further contain other agents which either 
enhance the activity of the protein or other active ingredient or complement its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other active 
ingredient of the present invention may be included in formulations of the particular clotting 
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic
factor, or anti-inflammatory agent to minimize side effects of the clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids,
immunosuppressive agents). A protein of the present invention may be active in multimers 
(e.g., heterodimers or homodimers) or complexes with itself or other proteins. As a result,
pharmaceutical compositions of the invention may comprise a protein of the invention in
such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application 
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, 
latest edition. A therapeutically effective dose further refers to that amount of the compound
sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or
amelioration of the relevant medical condition, or an increase in rate of treatment, healing, 
prevention or amelioration of such conditions. When applied to an individual active 
ingredient, administered alone, a therapeutically effective dose refers to that ingredient 
alone. When applied to a combination, a therapeutically effective dose refers to combined
amounts of the active ingredients that result in the therapeutic effect, whether administered
in combination, serially or simultaneously.

In practicing the method of treatment or use of the present invention, a 
therapeutically effective amount of protein or other active ingredient of the present invention
is administered to a mammal having a condition to be treated. Protein or other active
ingredient of the present invention may be administered in accordance with the method of
the invention either alone or in combination with other therapies such as treatments 
employing cytokines, lymphokines or other hematopoietic factors. When co-administered
with one or more cytokines, lymphokines or other hematopoietic factors, protein or other
active ingredient of the present invention may be administered either simultaneously with
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or 
anti-thrombotic factors, or sequentially. If administered sequentially, the attending physician 
will decide on the appropriate sequence of administering protein or other active ingredient of 
the present invention in combination with cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, 
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of protein
or other active ingredient of the present invention used in the pharmaceutical composition or 
to practice the method of the present invention can be carried out in a variety of conventional 
ways, such as oral ingestion, inhalation, topical application or cutaneous, subcutaneous,
intraperitoneal, parenteral or intravenous injection. Intravenous administration to the patient
is preferred. 

Alternately, one may administer the compound in a local rather than systemic 
manner, for example, via injection of the compound directly into arthritic joints or into
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the compounds
may be administered topically, for example, as eye drops. Furthermore, one may administer
the drug in a targeted drug delivery system, for example, in a liposome coated with a specific
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes will be targeted
to and taken up selectively by the afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an 
effective dosage to the desired site of action. The determination of a suitable route of 
administration and an effective dosage for a particular indication is within the level of skill
in the art. Preferably for wound treatment, one administers the therapeutic compound
directly to the site. Suitable dosage ranges for the polypeptides of the invention can be 
extrapolated from these dosages or from similar studies in appropriate animal models. 
Dosages can then be adjusted as necessary by the clinician to provide maximal therapeutic 
benefit. 


4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus 
may be formulated in a conventional manner using one or more physiologically acceptable 
carriers comprising excipients and auxiliaries which facilitate processing of the active
compounds into preparations which can be used pharmaceutically. These pharmaceutical
compositions may be manufactured in a manner that is itself known, e.g., by means of
conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, 
encapsulating, entrapping or lyophilizing processes. Proper formulation is dependent upon 
the route of administration chosen. When a therapeutically effective amount of protein or
other active ingredient of the present invention is administered orally, protein or other active
ingredient of the present invention will be in the form of a tablet, capsule, powder, solution 
or elixir. When administered in tablet form, the pharmaceutical composition of the invention 
may additionally contain a solid carrier such as a gelatin or an adjuvant. The tablet, capsule, 
and powder contain from about 5 to 95% protein or other active ingredient of the present
invention, and preferably from about 25 to 90% protein or other active ingredient of the
present invention. When administered in liquid form, a liquid carrier such as water, 
petroleum, oils of animal or plant origin such as peanut oil, mineral oil, soybean oil, or 
sesame oil, or synthetic oils may be added. The liquid form of the pharmaceutical
composition may further contain physiological saline solution, dextrose or other saccharide
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. When 
administered in liquid form, the pharmaceutical composition contains from about 0.5 to 90% 
by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, 
protein or other active ingredient of the present invention will be in the form of a 
pyrogen-free, parenterally acceptable aqueous solution. The preparation of such parenterally 
acceptable protein or other active ingredient solutions, having due regard to pH, isotonicity,
stability, and the like, is within the skill in the art. A preferred pharmaceutical composition 
for intravenous, cutaneous, or subcutaneous injection should contain, in addition to protein 
or other active ingredient of the present invention, an isotonic vehicle such as Sodium 
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and Sodium Chloride 
Injection, Lactated Ringer's Injection, or other vehicle as known in the art. The
pharmaceutical composition of the present invention may also contain stabilizers, 
preservatives, buffers, antioxidants, or other additives known to those of skill in the art. For 
injection, the agents of the invention may be formulated in aqueous solutions, preferably in
physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in 
the art. 

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such 
carriers enable the compounds of the invention to be formulated as tablets, pills, dragees, 
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a 
patient to be treated. Pharmaceutical preparations for oral use can be obtained by combining
the active compounds with a solid excipient, optionally grinding the resulting mixture, and
processing the mixture of granules,
after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable 
excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or 
sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, 
potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, 
sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as cross-linked polyvinyl pyrrolidone, agar, or
alginic acid or a salt thereof such as sodium alginate. Dragee cores are provided with 
suitable coatings. For this purpose, concentrated sugar solutions may be used, which may 
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, 
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. 
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to 
characterize different combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made 
of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol 
or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler 
such as lactose, binders such as starches, and/or lubricants such as talc or magnesium 
stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved 
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene 
glycols. In addition, stabilizers may be added. All formulations for oral administration 
should be in dosages suitable for such administration. For buccal administration, the 
compositions may take the form of tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide 
or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined 
by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin 
for use in an inhaler or insufflator may be formulated containing a powder mix of the 
compound and a suitable powder base such as lactose or starch. The compounds may be 
formulated for parenteral administration by injection, e.g., by bolus injection or continuous
infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules 
or in multi-dose containers, with an added preservative. The compositions may take such 
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain 
formulatory agents such as suspending, stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions 
of the active compounds in water-soluble form. Additionally, suspensions of the active 
compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic 
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain
substances which increase the viscosity of the suspension, such as sodium carboxymethyl 
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable 
stabilizers or agents which increase the solubility of the compounds to allow for the 
preparation of highly concentrated solutions. Alternatively, the active ingredient may be in
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before
use. 

The compounds may also be formulated in rectal compositions such as suppositories 
or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or
other glycerides. In addition to the formulations described previously, the compounds may
also be formulated as a depot preparation. Such long acting formulations may be 
administered by implantation (for example subcutaneously or intramuscularly) or by 
intramuscular injection. Thus, for example, the compounds may be formulated with suitable 
polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion
exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent system.
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water 
solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces 
low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system 
may be varied considerably without destroying its solubility and toxicity characteristics.
Furthermore, the identity of the co-solvent components may be varied: for example, other
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the fraction size of 
polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene 
glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for 
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical compounds
may be employed. Liposomes and emulsions are well known examples of delivery vehicles
or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also 
may be employed, although usually at the cost of greater toxicity. Additionally, the 
compounds may be delivered using a sustained-release system, such as semipermeable
matrices of solid hydrophobic polymers containing the therapeutic agent. Various types of
sustained-release materials have been established and are well known by those skilled in the
art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed. 
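
The composition of the VPD:5W co-solvent system described above follows from the stated VPD concentrations and the 1:1 dilution with 5% dextrose in water. The Python sketch below works through that arithmetic. It assumes that "1:1" means equal volumes and that volumes are additive on mixing, neither of which is stated in the text; the function and variable names are hypothetical.

```python
# Component concentrations as stated in the text (% w/v).
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W = {"dextrose": 5.0}  # 5% dextrose in water

def dilute(stock: dict[str, float], diluent: dict[str, float],
           stock_parts: float = 1.0, diluent_parts: float = 1.0) -> dict[str, float]:
    """Final % w/v of each component after mixing, assuming additive volumes."""
    total = stock_parts + diluent_parts
    mix = {name: pct * stock_parts / total for name, pct in stock.items()}
    for name, pct in diluent.items():
        mix[name] = mix.get(name, 0.0) + pct * diluent_parts / total
    return mix

vpd_5w = dilute(VPD, D5W)
```

Under these assumptions each VPD component is halved in VPD:5W (1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300) and the dextrose concentration is 2.5%. Changing stock_parts or diluent_parts shows how the proportions may be varied.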

The pharmaceutical compositions also may comprise suitable solid or gel phase 
carriers or excipients. Examples of such carriers or excipients include but are not limited to 
calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such 
pharmaceutically acceptable base addition salts are those salts which retain the biological 
effectiveness and properties of the free acids and which are obtained by reaction with 
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanolamine and the like.

The pharmaceutical composition of the invention may be in the form of a complex of 
the protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) 
following presentation of the antigen by MHC proteins. MHC and structurally related 
proteins including those encoded by class I and class II MHC genes on host cells will serve 
to present the peptide antigen(s) to T lymphocytes. The antigen components could also be
supplied as purified MHC-peptide complexes alone or with co-stimulatory molecules that
can directly signal T cells. Alternatively, antibodies able to bind surface immunoglobulin
and other molecules on B cells as well as antibodies able to bind the TCR and other 
molecules on T cells can be combined with the pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. 
Suitable lipids for liposomal formulation include, without limitation, monoglycerides,
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like.
Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, 
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of 
which are incorporated herein by reference. 

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and
severity of the condition being treated, and on the nature of prior treatments which the 
patient has undergone. Ultimately, the attending physician will decide the amount of protein 
or other active ingredient of the present invention with which to treat each individual patient.
Initially, the attending physician will administer low doses of protein or other active
ingredient of the present invention and observe the patient's response. Larger doses of 
protein or other active ingredient of the present invention may be administered until the 
optimal therapeutic effect is obtained for the patient, and at that point the dosage is not 
increased further. It is contemplated that the various pharmaceutical compositions used to
practice the method of the present invention should contain about 0.01 µg to about 100 mg
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to about 1 mg) of
protein or other active ingredient of the present invention per kg body weight. For 
compositions of the present invention which are useful for bone, cartilage, tendon or 
ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the
therapeutic composition for use in this invention is, of course, in a pyrogen-free, 
physiologically acceptable form. Further, the composition may desirably be encapsulated or 
injected in a viscous form for delivery to the site of bone, cartilage or tissue damage. 
Topical administration may be suitable for wound healing and tissue repair. Therapeutically
useful agents other than a protein or other active ingredient of the invention which may also
optionally be included in the composition as described above, may alternatively or 
additionally, be administered simultaneously or sequentially with the composition in the 
methods of the invention. Preferably for bone and/or cartilage formation, the composition 
would include a matrix capable of delivering the protein-containing or other active
ingredient-containing composition to the site of bone and/or cartilage damage, providing a
structure for the developing bone and cartilage and optimally capable of being resorbed into 
the body. Such matrices may be formed of materials presently in use for other implanted 
medical applications.

The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The particular 
application of the compositions will define the appropriate formulation. Potential matrices 
for the compositions may be biodegradable and chemically defined calcium sulfate, 
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides.
Other potential materials are biodegradable and biologically well-defined, such as bone or 
dermal collagen. Further matrices are comprised of pure proteins or extracellular matrix 
components. Other potential matrices are nonbiodegradable and chemically defined, such as 
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may be comprised
of combinations of any of the above-mentioned types of material, such as polylactic acid and
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be altered in 
composition, such as in calcium-aluminate-phosphate and processing to alter pore size, 
particle size, particle shape, and biodegradability. Presently preferred is a 50:50 (mole 
weight) copolymer of lactic acid and glycolic acid in the form of porous particles having
diameters ranging from 150 to 800 microns. In some applications, it will be useful to utilize
a sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent 
the protein compositions from disassociating from the matrix. 

A preferred family of sequestering agents is cellulosic materials such as 
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which
represents the amount necessary to prevent desorption of the protein from the polymer 
matrix and to provide appropriate handling of the composition, yet not so much that the 
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein the 
opportunity to assist the osteogenic activity of the progenitor cells. In further compositions,
proteins or other active ingredients of the invention may be combined with other agents
beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in question. 
These agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site 
of damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue 

10 (e.g., bone), the patient's age, sex, and diet, the severity of any infection, time of 

administration and other clinical factors. The dosage may vary with the type of matrix used 
in the reconstitution and with inclusion of other proteins in the pharmaceutical composition. 
For example, the addition of other known growth factors, such as IGF I (insulin like growth 
factor I), to the final composition, may also effect the dosage- Progress can be monitored by 

15 periodic assessment of tissue/bone growth and/or repair, for example, X-rays, 
histomorphometric determinations and tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including, without
limitation, in the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in 
the presence of proteins of the present invention in order to proliferate or to produce a 
desired effect on or activity in such cells. Treated cells can then be introduced in vivo for 
therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve
their intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject
being treated. Determination of the effective amount is well within the capability of those
skilled in the art, especially in light of the detailed disclosure provided herein. For any
compound used in the method of the invention, the therapeutically effective dose can be
estimated initially from appropriate in vitro assays. For example, a dose can be formulated in
animal models to achieve a circulating concentration range that includes the IC50 as
determined in cell culture (i.e., the concentration of the test compound which achieves a
half-maximal inhibition of the protein's biological activity). Such information can be used
to more accurately determine useful doses in humans.
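
The IC50 definition above can be illustrated with a simple dose-response curve. The following is a minimal sketch only: it assumes a Hill-type inhibition curve with a Hill coefficient of 1, which is an illustrative assumption and not something stated in this disclosure, and the function names are hypothetical.

```python
def fractional_inhibition(conc, ic50, hill=1.0):
    """Fraction of biological activity inhibited at concentration `conc`.

    Assumes a Hill-type curve (hill=1.0 is an illustrative assumption).
    `conc` and `ic50` must be expressed in the same units.
    """
    if conc < 0 or ic50 <= 0:
        raise ValueError("conc must be >= 0 and ic50 must be > 0")
    if conc == 0:
        return 0.0
    return conc ** hill / (conc ** hill + ic50 ** hill)


def range_includes_ic50(low, high, ic50):
    """True if the circulating concentration range [low, high] spans the IC50."""
    return low <= ic50 <= high
```

By construction, `fractional_inhibition(ic50, ic50)` is one half, which matches the half-maximal inhibition used to define the IC50.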

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in
cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50%
of the population) and the ED50 (the dose therapeutically effective in 50% of the population).
The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be
expressed as the ratio between LD50 and ED50. Compounds which exhibit high therapeutic
indices are preferred. The data obtained from these cell culture assays and animal studies
can be used in formulating a range of dosage for use in humans. The dosage of such
compounds lies preferably within a range of circulating concentrations that includes the ED50
with little or no toxicity. The dosage may vary within this range depending upon the dosage
form employed and the route of administration utilized. The exact formulation, route of
administration and dosage can be chosen by the individual physician in view of the patient's
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch.
1, p. 1. Dosage amount and interval may be adjusted individually to provide plasma levels of
the active moiety which are sufficient to maintain the desired effects, or minimal effective
concentration (MEC). The MEC will vary for each compound but can be estimated from in
vitro data. Dosages necessary to achieve the MEC will depend on individual characteristics
and route of administration. However, HPLC assays or bioassays can be used to determine
plasma concentrations.
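
The therapeutic index described above is a simple ratio, and a short sketch may make the comparison between compounds concrete. The compound names and dose values below are hypothetical placeholders chosen for illustration; they are not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Therapeutic index expressed as the ratio LD50 / ED50 (same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("ld50 and ed50 must be positive")
    return ld50 / ed50


def rank_by_therapeutic_index(compounds):
    """Sort (name, ld50, ed50) tuples so the highest therapeutic index comes first."""
    return sorted(compounds, key=lambda c: therapeutic_index(c[1], c[2]), reverse=True)


# Hypothetical example values (mg/kg), for illustration only.
example = [("compound_A", 200.0, 10.0), ("compound_B", 500.0, 100.0)]
ranked = rank_by_therapeutic_index(example)
```

Here the hypothetical `compound_A` has an index of 20 and `compound_B` an index of 5, so `compound_A` would be the preferred candidate under the stated criterion.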

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of
the time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying
in adults and children. Dosing may be once daily, or equivalent doses may be delivered at
longer or shorter intervals.
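
The MEC-based criterion for choosing a dosing interval can be sketched numerically. The example below assumes a one-compartment model with first-order elimination and an instantaneous peak at the start of each interval; that model, the parameter names, and the tier labels are assumptions made for illustration and are not specified in this disclosure.

```python
import math


def fraction_time_above_mec(c_peak, k_elim, tau, mec):
    """Fraction of a dosing interval `tau` during which plasma level exceeds `mec`.

    Assumes C(t) = c_peak * exp(-k_elim * t) within one interval
    (one-compartment, first-order elimination; an illustrative assumption).
    """
    if min(c_peak, k_elim, tau, mec) <= 0:
        raise ValueError("all parameters must be positive")
    if c_peak <= mec:
        return 0.0
    t_above = math.log(c_peak / mec) / k_elim  # time for C(t) to fall to the MEC
    return min(t_above, tau) / tau


def regimen_tier(fraction):
    """Classify a fraction of time above the MEC against the ranges stated in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred"
    if 0.3 <= fraction <= 0.9:
        return "preferred"
    if 0.1 <= fraction <= 0.9:
        return "acceptable"
    return "outside stated range"
```

For instance, with a peak of 8 units, an MEC of 1 unit, a 2-hour half-life and a 12-hour interval, the level stays above the MEC for about 6 hours, or roughly half of the interval.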

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 
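
As a worked illustration of the exemplary per-kilogram ranges given above, the sketch below converts them into absolute daily amounts for a given body weight. It reads the lower bounds as micrograms per kilogram, and the function name and the 70 kg example weight are hypothetical; actual dosing remains a matter for the prescribing physician as stated in the text.

```python
def daily_dose_range_mg(body_weight_kg, preferred=False):
    """Return (low_mg, high_mg) per day for the exemplary ranges in the text.

    General range:   0.01 ug/kg to 100 mg/kg body weight daily.
    Preferred range: 0.1 ug/kg to 25 mg/kg body weight daily.
    """
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    low_ug_per_kg, high_mg_per_kg = (0.1, 25.0) if preferred else (0.01, 100.0)
    low_mg = low_ug_per_kg * body_weight_kg / 1000.0  # convert ug to mg
    high_mg = high_mg_per_kg * body_weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg subject.
general = daily_dose_range_mg(70.0)
narrow = daily_dose_range_mg(70.0, preferred=True)
```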



4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which 
may contain one or more unit dosage forms containing the active ingredient. The pack may, 
for example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser 
device may be accompanied by instructions for administration. Compositions comprising a 
compound of the invention formulated in a compatible pharmaceutical carrier may also be
prepared, placed in an appropriate container, and labeled for treatment of an indicated 
condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of
the invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that
contain an antigen-binding site that specifically binds (immunoreacts with) an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ
from one another by the nature of the heavy chain present in the molecule. Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a
reference to all such classes, subclasses and types of human antibody species.

An isolated related protein of the invention, or a portion or fragment thereof, may be
intended to serve as an antigen and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for
polyclonal and monoclonal antibody preparation. The full-length protein can be used or,
alternatively, the invention provides antigenic peptide fragments of the antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the
amino acid sequence of the full length protein, such as an amino acid sequence shown in
SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8, and encompasses an epitope
thereof such that an antibody raised against the peptide forms a specific immune complex
with the full length protein or with any fragment that contains the epitope. Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on
its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A
hydrophobicity analysis of the human related protein sequence will indicate which regions of
a related protein are particularly hydrophilic and, therefore, are likely to encode surface
residues useful for targeting antibody production. As a means for targeting antibody
production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be
generated by any method well known in the art, including, for example, the Kyte-Doolittle or
the Hopp-Woods methods, either with or without Fourier transformation. See, e.g., Hopp and
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol.
Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
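
The hydropathy analysis mentioned above can be sketched as a sliding-window average over the Kyte-Doolittle scale. This is a minimal illustration without Fourier transformation; the window length of 7, the function names, and the toy sequence in the usage note are assumptions chosen for the example and are not taken from this disclosure.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982); negative = hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window of `window` residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start_index, peptide, mean_score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]
```

Applied to a toy sequence such as `"IIIIIIIKKKKKKKIIIIIII"`, the lysine-rich stretch is returned as the most hydrophilic window, which is the kind of region the text identifies as a candidate surface epitope.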

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

The term "specific for" indicates that the variable regions of the antibodies of the 
invention recognize and bind polypeptides of the invention exclusively (i.e., able to 
distinguish the polypeptide of the invention from other similar polypeptides despite sequence
identity, homology, or similarity found in the family of polypeptides), but may also interact
with other proteins (for example, S. aureus protein A or other antibodies in ELISA 
techniques) through interactions with sequences outside the variable region of the antibodies, 
and in particular, in the constant region of the molecule. Screening assays to determine
binding specificity of an antibody of the invention are well known and routinely practiced in
the art. For a comprehensive discussion of such assays, see Harlow et al. (Eds), Antibodies:
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY (1988),
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of the
invention are also contemplated, provided that the antibodies are first and foremost specific
for, as defined above, full-length polypeptides of the invention. As with antibodies that are 
specific for full length polypeptides of the invention, antibodies of the invention that 
recognize fragments are those which can distinguish polypeptides from the same family of 
polypeptides despite inherent sequence identity, homology, or similarity found in the family 
of proteins.

Antibodies of the invention are useful for, for example, therapeutic purposes (by 
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or 
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the 
invention. Kits comprising an antibody of the invention for any of the purposes described
herein are also comprehended. In general, a kit of the invention also includes a control
antigen for which the antibody is immunospecific. The invention further provides a 
hybridoma that produces an antibody according to the invention. Antibodies of the 
invention are useful for detection and/or purification of the polypeptides of the invention. 
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions 
associated with the protein and also in the treatment of some forms of cancer where 
abnormal expression of the protein is involved. In the case of cancerous cells or leukemic 
cells, neutralizing monoclonal antibodies against the protein may be useful in detecting and
preventing the metastatic spread of the cancerous cells, which may be mediated by the
protein. 

The labeled antibodies of the present invention can be used for in vitro, in vivo, and 
in situ assays to identify cells or tissues in which a fragment of the polypeptide of interest is 
expressed. The antibodies may also be used directly in therapies or other diagnostics. The 
present invention further provides the above-described antibodies immobilized on a solid
support. Examples of such solid supports include plastics such as polycarbonate, complex
carbohydrates such as agarose and Sepharose®, acrylic resins such as polyacrylamide,
and latex beads. Techniques for coupling antibodies to such solid supports are well known
in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et al., Meth.
Enzym. 34 Academic Press, N. Y. (1974)). The immobilized antibodies of the present
invention can be used for in vitro, in vivo, and in situ assays as well as for immuno-affinity
purification of the proteins of the present invention.

Various procedures known within the art may be used for the production of 
polyclonal or monoclonal antibodies directed against a protein of the invention, or against 
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are
discussed below. 






4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g., 
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with the
native protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated 
to a second protein known to be immunogenic in the mammal being immunized. Examples 
of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin,
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can 
further include an adjuvant. Various adjuvants used to increase the immunological response 
include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., 
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as
Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory 
agents. Additional examples of adjuvants that can be employed include MPL-TDM adjuvant 
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide 
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8
(April 17, 2000), pp. 25-28).



4.13.2 MONOCLONAL ANTIBODIES 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular
species of antibody molecule consisting of a unique light chain gene product and a unique
heavy chain gene product. In particular, the complementarity determining regions (CDRs) 
of the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen-binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an 
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies 
that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be
immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells 
of human origin are desired, or spleen cells or lymph node cells are used if non-human 
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp.
59-103). Immortalized cell lines are usually transformed mammalian cells, particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell 
lines are employed. The hybridoma cells can be cultured in a suitable culture medium that
preferably contains one or more substances that inhibit the growth or survival of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine 
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas
typically will include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which
substances prevent the growth of HGPRT-deficient cells. 
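
The selection logic behind HAT medium can be summarized as a small truth table. The sketch below is a simplification for illustration: it assumes that growth in HAT medium requires a functional HGPRT salvage pathway (because aminopterin blocks de novo nucleotide synthesis) and that sustained growth in culture additionally requires an immortalized phenotype. The function and dictionary names are hypothetical.

```python
def survives_hat_selection(hgprt_positive, immortal):
    """Simplified rule: a cell population persists in HAT medium only if it
    has HGPRT activity (salvage pathway) and is immortalized."""
    return hgprt_positive and immortal


# (hgprt_positive, immortal) for the three populations present after fusion.
POPULATIONS = {
    "unfused HGPRT-deficient myeloma": (False, True),
    "unfused lymphocyte": (True, False),
    "hybridoma": (True, True),
}

survivors = [name for name, traits in POPULATIONS.items()
             if survives_hat_selection(*traits)]
```

Under these assumptions only the fused hybridoma population remains, which is the outcome the selective medium is intended to produce.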

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium, such as HAT medium. More preferred immortalized cell lines are murine myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, 
San Diego, California and the American Type Culture Collection, Manassas, Virginia. 
Human myeloma and mouse-human heteromyeloma cell lines also have been described for 
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications, Marcel
Dekker, Inc., New York, (1987) pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed 
for the presence of monoclonal antibodies directed against the antigen. Preferably, the 
binding specificity of monoclonal antibodies produced by the hybridoma cells is determined
by immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in
the art. The binding affinity of the monoclonal antibody can, for example, be determined by 
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target
antigen are isolated.
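
To make the affinity determination above more concrete, the sketch below fits a classical linear Scatchard plot, in which bound/free is regressed on bound so that the slope is the negative of the association constant and the x-intercept is the binding capacity. This is a plain least-squares illustration and is not the specific computational procedure of Munson and Pollard; the function name and the synthetic data in the usage note are assumptions.

```python
def scatchard_fit(bound, free):
    """Fit bound/free = Ka * (Bmax - bound) by ordinary least squares.

    Returns (Ka, Bmax). `bound` and `free` are equal-length sequences of
    ligand concentrations in the same units; Ka is in reciprocal units.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope              # slope of a Scatchard plot is -Ka
    bmax = intercept / ka    # x-intercept gives the binding capacity
    return ka, bmax
```

With synthetic single-site data generated from Ka = 0.5 and Bmax = 10 (for example, free = 0.5, 1, 2, 4, 8 and bound = 10·free/(2 + free)), the fit recovers those two values.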

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods. Suitable culture media for this 
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
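
The rationale for subcloning by limiting dilution can be illustrated with a short calculation. The sketch assumes that the number of cells seeded per well follows a Poisson distribution, which is a common modeling assumption and is not stated in this disclosure; the function name and the example seeding density are hypothetical.

```python
import math


def clonal_fraction(mean_cells_per_well):
    """Among wells that received at least one cell, the expected fraction
    that received exactly one (Poisson seeding assumed)."""
    m = mean_cells_per_well
    if m <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    p_zero = math.exp(-m)          # wells with no cells
    p_one = m * math.exp(-m)       # wells with exactly one cell
    return p_one / (1.0 - p_zero)


# Hypothetical seeding density of 0.3 cells per well.
fraction_single = clonal_fraction(0.3)
```

Lower seeding densities push this fraction toward one, at the cost of more empty wells, which is why more than one round of subcloning is often used in practice.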

The monoclonal antibodies secreted by the subclones can be isolated or purified from
the culture medium or ascites fluid by conventional immunoglobulin purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel 
electrophoresis, dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of
the invention can be readily isolated and sequenced using conventional procedures (e.g., by 
using oligonucleotide probes that are capable of binding specifically to genes encoding the 
heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as
a preferred source of such DNA. Once isolated, the DNA can be placed into expression
vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster 
ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, 
to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA 
also can be modified, for example, by substituting the coding sequence for human heavy and
light chain constant domains in place of the homologous murine sequences (U.S. Patent No. 
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the 
immunoglobulin coding sequence all or part of the coding sequence for a
non-immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted
for the constant domains of an antibody of the invention, or can be substituted for the
variable domains of one antigen-combining site of an antibody of the invention to create a
chimeric bivalent antibody. 

4.13.3 HUMANIZED ANTIBODIES 

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence derived from a 
non-human immunoglobulin. Humanization can be performed following the method of 
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et al., Nature, 
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by substituting
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See
also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
can also comprise residues that are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, the humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework regions are those of a human immunoglobulin 
consensus sequence. The humanized antibody optimally also will comprise at least a portion
of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 2, 593-596
(1992)). 

4.13.4 HUMAN ANTIBODIES

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from 
human genes. Such antibodies are termed "human antibodies", or "fully human antibodies" 
herein. Human monoclonal antibodies can be prepared by the trioma technique; the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human 
monoclonal antibodies may be utilized in the practice of the present invention and may be 
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80,
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 


In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227, 381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in 
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10,
779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93
(1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
that are modified so as to produce fully human antibodies rather than the animal's
endogenous antibodies in response to challenge by an antigen. (See PCT publication 
WO94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains
in the nonhuman host have been incapacitated, and active loci encoding human heavy and
light chain immunoglobulins are inserted into the host's genome. The human genes are 
incorporated, for example, using yeast artificial chromosomes containing the requisite 
human DNA segments. An animal which provides all the desired modifications is then 
obtained as progeny by crossbreeding intermediate transgenic animals containing fewer than
the full complement of the modifications. The preferred embodiment of such a nonhuman 
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT publications WO 
96/33735 and WO 96/34096. This animal produces B cells that secrete fully human 
immunoglobulins. The antibodies can be obtained directly from the animal after
immunization with an immunogen of interest, as, for example, a preparation of a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, 
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. 
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment genes 
from at least one endogenous heavy chain locus in an embryonic stem cell to prevent
rearrangement of the locus and to prevent formation of a transcript of a rearranged
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector 
containing a gene encoding a selectable marker; and producing from the embryonic stem cell 
a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable 
marker. 

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in 
culture, introducing an expression vector containing a nucleotide sequence encoding a light 
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The
hybrid cell expresses an antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody that
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT
publication WO 99/53049.

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES 

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid and effective
identification of monoclonal Fab fragments with the desired specificity for a protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the
idiotypes to a protein antigen may be produced by techniques known in the art including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule;
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an
Fab fragment generated by the treatment of the antibody molecule with papain and a reducing
agent; and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies 
that have binding specificities for at least two different antigens. In the present case, one of 
the binding specificities is for an antigenic protein of the invention. The second binding
target is any other antigen, and advantageously is a cell-surface protein or receptor or 
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) 
produce a potential mixture of ten different antibody molecules, of which only one has the 
correct bispecific structure. The purification of the correct molecule is usually accomplished
by affinity chromatography steps. Similar procedures are disclosed in WO 93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659. 
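
The figure of ten molecules can be checked by simple enumeration. The sketch below is illustrative only and assumes that each antibody arm is one heavy chain paired with one light chain, that any heavy chain can pair with any light chain, and that the two arms of a molecule are interchangeable. The chain labels H1, H2, L1 and L2 are placeholders introduced here, not names from the cited references.

```python
from itertools import combinations_with_replacement, product

HEAVY = ("H1", "H2")  # heavy chains of the two parental antibodies (placeholder labels)
LIGHT = ("L1", "L2")  # light chains of the two parental antibodies (placeholder labels)


def quadroma_products():
    """Enumerate the distinct two-armed molecules that random chain assortment allows."""
    # Each arm is one heavy chain paired with one light chain: four possible arms.
    arms = [h + l for h, l in product(HEAVY, LIGHT)]
    # A molecule is an unordered pair of arms, and both arms may be the same.
    return list(combinations_with_replacement(arms, 2))


def is_correct_bispecific(molecule):
    """True only when each heavy chain carries its own (cognate) light chain."""
    return set(molecule) == {"H1L1", "H2L2"}


molecules = quadroma_products()
correct = [m for m in molecules if is_correct_bispecific(m)]
print(len(molecules), len(correct))
```

Under these assumptions the enumeration yields ten molecules, exactly one of which pairs each heavy chain with its cognate light chain, which is why a purification step is needed.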

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant
region (CH1) containing the site necessary for light-chain binding present in at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and are
co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210 (1986).

According to another approach described in WO 96/27011, the interface between a
pair of antibody molecules can be engineered to maximize the percentage of heterodimers
that are recovered from recombinant cell culture. The preferred interface comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one or more small
amino acid side chains from the interface of the first antibody molecule are replaced with
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or
similar size to the large side chain(s) are created on the interface of the second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or
threonine). This provides a mechanism for increasing the yield of the heterodimer over other
unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full-length antibodies or antibody fragments
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from
antibody fragments have been described in the literature. For example, bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985) describe a
procedure wherein intact antibodies are proteolytically cleaved to generate F(ab')2
fragments. These fragments are reduced in the presence of the dithiol complexing agent
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB) derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by reduction with
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced can be used
as agents for the selective immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well as
trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol. 148(5), 1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Holliger et al., Proc. Natl. Acad. Sci. USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) by a linker which is too short to allow pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one fragment are
forced to pair with the complementary VL and VH domains of another fragment, thereby
forming two antigen-binding sites. Another strategy for making bispecific antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported. See, Gruber et
al., J. Immunol. 152, 5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA.
Another bispecific antibody of interest binds the protein antigen described herein and further
binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such 
antibodies have, for example, been proposed to target immune system cells to unwanted cells 
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking
agents. For example, immunotoxins can be constructed using a disulfide exchange reaction
or by forming a thioether bond. Examples of suitable reagents for this purpose include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980. 



4.13.8 EFFECTOR FUNCTION ENGINEERING 

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC).
See Caron et al., J. Exp. Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148,
2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can also be
prepared using heterobifunctional cross-linkers as described in Wolff et al., Cancer
Research, 53, 2560-2565 (1993). Alternatively, an antibody can be engineered that has dual
Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. See
Stevenson et al., Anti-Cancer Drug Design, 3, 219-230 (1989).



4.13.9 IMMUNOCONJUGATES 

The invention also pertains to immunoconjugates comprising an antibody conjugated 
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active 
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive
isotope (i.e., a radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of radioconjugated
antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine),
bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as
described in Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an
exemplary chelating agent for conjugation of radionucleotide to the antibody. See
WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.


4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention 
can be recorded on computer readable media. As used herein, "computer readable media" 
refers to any medium which can be read and accessed directly by a computer. Such media 
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical 
storage media such as RAM and ROM; and hybrids of these categories such as 
magnetic/optical storage media. A skilled artisan can readily appreciate how any of the
presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the
present invention. As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently known
methods for recording information on computer readable medium to generate manufactures
comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present 
invention on computer readable medium. The sequence information can be represented in a 
word processing text file, formatted in commercially-available software such as WordPerfect 
and Microsoft Word, or represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any
number of data processor structuring formats (e.g. text file or database) in order to obtain 
computer readable medium having recorded thereon the nucleotide sequence information of 
the present invention. 
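
As an illustration of the two storage formats mentioned above, the following sketch records a sequence both as an ASCII text record and as a row in a relational database. It is a minimal example under stated assumptions: the FASTA layout and the SQLite in-memory database stand in for whatever file format and database application (DB2, Sybase, Oracle, or the like) a practitioner chooses, and the sequence and identifier shown are placeholders, not sequences of the invention.

```python
import sqlite3
import textwrap


def to_fasta(seq_id, sequence, width=60):
    """Format one sequence as a FASTA record, i.e. plain ASCII text."""
    body = "\n".join(textwrap.wrap(sequence, width))
    return f">{seq_id}\n{body}\n"


def store_sequences(conn, records):
    """Record (identifier, residues) pairs in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records)
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Read one recorded sequence back, or None if it was never stored."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


# Placeholder record (hypothetical identifier and residues).
example = ("EXAMPLE_1", "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
conn = sqlite3.connect(":memory:")
store_sequences(conn, [example])
fasta_text = to_fasta(*example)
```

Which form is preferable depends, as noted above, on the means chosen to access the stored information: a flat text record suits sequential scanning, while a keyed table suits retrieval by identifier.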

By providing any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 or
a representative fragment thereof; or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 in computer readable form, a
skilled artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search
algorithms on a Sybase system is used to identify open reading frames (ORFs) within a
nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be useful in
producing commercially important proteins such as enzymes used in fermentation reactions
and in the production of commercially useful metabolites.
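
To make the notion of an ORF concrete, the following sketch scans all six reading frames of a nucleotide sequence for runs that begin with ATG and end at the first in-frame stop codon. It is not an implementation of BLAST or BLAZE, which are similarity-search algorithms; it is a deliberately simple stand-in, and the function names, the `min_codons` threshold and the test sequence are assumptions made for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) for each ATG...stop run.

    start and end are 0-based offsets on the strand that was scanned,
    and min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # resume looking for the next start codon
    return orfs


print(find_orfs("CCATGGCCTGATT"))
```

In practice the minimum length would be set far higher than three codons, and candidate ORFs would then be compared against known sequences with a search program to judge whether they are likely to encode protein.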

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the
present invention comprises a central processing unit (CPU), input means, output means, and
data storage means. A skilled artisan can readily appreciate that any one of the currently
available computer-based systems is suitable for use in the present invention. As stated
above, the computer-based systems of the present invention comprise a data storage means
having stored therein a nucleotide sequence of the present invention and the necessary
hardware means and software means for supporting and implementing a search means. As
used herein, "data storage means" refers to memory which can store nucleotide sequence
information of the present invention, or a memory access means which can access
manufactures having recorded thereon the nucleotide sequence information of the present
invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target structural
motif with the sequence information stored within the data storage means. Search means are
used to identify fragments or regions of a known sequence which match a particular target
sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means are available and can be
used in the computer-based systems of the present invention. Examples of such software
include, but are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX
(NCBI). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches can be
adapted for use in the present computer-based systems. As used herein, a "target sequence"
can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more
amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the
less likely a target sequence will be present as a random occurrence in the database. The
most preferred sequence length of a target sequence is from about 10 to 300 amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in
gene expression and protein processing, may be of shorter length.
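
The relationship between target length and chance occurrence can be illustrated numerically. The sketch below is a rough estimate only: it assumes that database residues are independent and equally frequent, which real sequences are not, and it counts exact matches on one strand. The function name and the database size used in the example are hypothetical.

```python
def expected_random_hits(target_len, database_len, alphabet_size=4):
    """Expected number of exact chance matches to a target of the given length.

    Assumes independent, uniformly distributed residues (an idealization).
    alphabet_size is 4 for nucleotides and 20 for amino acids.
    """
    positions = max(database_len - target_len + 1, 0)  # possible alignment starts
    return positions * alphabet_size ** (-target_len)


# Compare a six-nucleotide target with a thirty-nucleotide target
# against a hypothetical database of one million nucleotides.
for length in (6, 30):
    print(length, expected_random_hits(length, 10**6))
```

Under these assumptions a six-nucleotide target is expected to occur by chance hundreds of times in a million-nucleotide database, whereas a thirty-nucleotide target is expected essentially never, which is the reason longer targets are preferred.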

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on
a three-dimensional configuration which is formed upon the folding of the target motif.
There are a variety of target motifs known in the art. Protein target motifs include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible expression
elements (protein binding sequences).

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used 
to control gene expression through triple helix formation or antisense DNA or RNA, both of 
which methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and 
are designed to be complementary to a region of the gene involved in transcription (triple 
helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization 
blocks translation of an mRNA molecule into polypeptide. Both techniques have been 
demonstrated to be effective in model systems. Information contained in the sequences of 
the present invention is necessary for the design of an antisense or triple helix 
oligonucleotide. 
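
The way sequence information feeds into antisense design can be sketched as follows. The example derives a DNA oligonucleotide complementary to a chosen window of an mRNA and enforces the preferred length of 20 to 40 bases. It is a minimal illustration only: it considers base complementarity alone and ignores target accessibility, melting temperature and specificity, and the function name and the example mRNA are placeholders rather than sequences of the invention.

```python
# Maps each RNA target base to the complementary DNA base of the oligo.
COMPLEMENT = str.maketrans("ACGU", "TGCA")

# Hypothetical mRNA fragment used only for illustration.
EXAMPLE_MRNA = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (written 5' to 3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    # Accept either RNA or DNA notation for the target by normalizing T to U.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


print(antisense_oligo(EXAMPLE_MRNA, 0, 20))
```

A triple helix-forming oligonucleotide would be designed against the double-stranded gene region instead and follows different pairing rules, so this sketch covers the antisense case only.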

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression 
of one of the ORFs of the present invention, or homolog thereof, in a test sample, using a 
nucleic acid probe or antibodies of the present invention, optionally conjugated or otherwise 
associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the 
polynucleotide for a period sufficient to form the complex, and detecting the complex, so 
that if a complex is detected, a polynucleotide of the invention is detected in the sample. 
Such methods can also comprise contacting a sample under stringent hybridization 
conditions with nucleic acid primers that anneal to a polynucleotide of the invention under 
such conditions, and amplifying annealed polynucleotides, so that if a polynucleotide is 
amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the complex, so that if
a complex is detected, a polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay.
One skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic
acid probes or antibodies of the present invention. Examples of such assays can be found in
Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The
Netherlands (1985). The test samples of the present invention include cells, protein or
membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the assay
format, nature of the detection method and the tissues, cells or extracts used as the sample to
be assayed. Methods for preparing protein extracts or membrane extracts of cells are well
known in the art and can readily be adapted in order to obtain a sample which is
compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the 
invention provides a compartment kit to receive, in close confinement, one or more 
containers which comprises: (a) a first container comprising one of the probes or antibodies
of the present invention; and (b) one or more other containers comprising one or more of the
following: wash reagents, reagents capable of detecting presence of a bound probe or 
antibody. 




In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the antibodies used
in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or probe. Types of detection reagents include labeled nucleic acid probes, labeled
secondary antibodies, or in the alternative, if the primary antibody is labeled, the enzymatic,
or antibody binding reagents which are capable of reacting with the labeled antibody. One
skilled in the art will readily recognize that the disclosed probes and antibodies of the present
invention can be readily incorporated into one of the established kit formats which are well
known in the art.



4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve
chemical attachment of a labeling or imaging agent, administration of the labeled 
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging the labeled 
polypeptide in vivo at the target site. 


4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present 
invention further provides methods of obtaining and identifying agents which bind to a 
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set forth 
in SEQ ID NO: 1-1041, or 2083-2534, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the 
present invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide
of the invention for a time sufficient to form a polynucleotide/compound complex, and
detecting the complex, so that if a polynucleotide/compound complex is detected, a
compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to
a polypeptide of the invention can comprise contacting a compound with a polypeptide of
the invention for a time sufficient to form a polypeptide/compound complex, and detecting
the complex, so that if a polypeptide/compound complex is detected, a compound that binds
to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can
also comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression
of a reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound
that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via
such methods can include compounds which modulate the expression of a polynucleotide of 
the invention (that is, increase or decrease expression relative to expression levels observed 
in the absence of the compound). Compounds, such as compounds identified via the 
methods of the invention, can be tested using standard assays well known to those of skill in 
the art for their ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be 
selected and screened at random or rationally selected or designed using protein modeling 
techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents
and the like are selected at random and are assayed for their ability to bind to the protein
encoded by the ORF of the present invention. Alternatively, agents may be rationally
selected or designed. As used herein, an agent is said to be "rationally selected or designed"
when the agent is chosen based on the configuration of the particular protein. For example,
one skilled in the art can readily adapt currently available procedures to generate peptides,
pharmaceutical agents and the like, capable of binding to a specific peptide sequence, in
order to generate rationally designed antipeptide peptides (see, for example, Hurby et al.,
"Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry
28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or
EMFs of the present invention. As described above, such agents can be randomly screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single 
ORF or multiple ORFs which rely on the same EMF for expression control. One class of 
DNA binding agents are agents which contain base residues which hybridize or form a triple
helix by binding to DNA or RNA. Such agents can be based on the classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric 
derivatives which have base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix -
see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in
a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences of
the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention
can be used as a diagnostic agent. Agents which bind to a protein encoded by one of the
ORFs of the present invention can be formulated using known techniques to generate a 
pharmaceutical composition. 




4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic 
acid hybridization probes capable of hybridizing with naturally occurring nucleotide 
sequences. The hybridization probes of the subject invention may be derived from any of
the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from 
any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 can be used as an 
indicator of the presence of RNA of cell type of such a tissue in a sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used
in PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of identical
sequences or a degenerate pool of possible sequences for identification of closely related
genomic sequences.
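
A degenerate pool of the kind described above can be written compactly with the standard IUPAC ambiguity codes and expanded into its member sequences. The sketch below is illustrative: the function name and the example probe are hypothetical, and a real design would also weigh pool size against hybridization specificity.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical probe with two ambiguous positions (R = A or G, Y = C or T).
pool = expand_degenerate("ATGRAY")
print(len(pool), pool)
```

The pool size is the product of the number of alternatives at each position, so it grows quickly as ambiguous positions are added.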

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such
vectors are known in the art and are commercially available and may be used to synthesize
RNA probes in vitro by means of the addition of the appropriate RNA polymerase, such as
T7 or SP6 RNA polymerase, and the appropriate radioactively labeled nucleotides. The
nucleotide sequences may be used to construct hybridization probes for mapping their
respective genomic sequences. The nucleotide sequence provided herein may be mapped to a
chromosome or specific regions of a chromosome using well-known genetic and/or
chromosomal mapping techniques. These techniques include in situ hybridization, linkage
analysis against known chromosomal markers, hybridization screening with libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the like. The
technique of fluorescent in situ hybridization of chromosome spreads has been described,
among other places, in Verma et al. (1988) Human Chromosomes: A Manual of Basic
Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical chromosomal
map and a specific disease (or predisposition to a specific disease) may help delimit the
region of DNA associated with that genetic disease. The nucleotide sequences of the subject
invention may be used to detect differences in gene sequences between normal, carrier or
affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly 
practiced using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those
of skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy
is to precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6), 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol.
Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988;
1989); all references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al (1994) Proc. Natl. Acad. Sci. USA 91(8),
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used.
Nunc Laboratories have developed a method by which DNA can be covalently bound to the
microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with
secondary amino groups (>NH) that serve as bridgeheads for further covalent coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be bound
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing immobilization of
more than 1 pmol of DNA (Rasmussen et al, (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as
immobilization using only a single covalent bond is preferred. The phosphoramidate bond joins
the DNA to the CovaLink NH secondary amino groups that are positioned at the end of 2 nm
long spacer arms covalently grafted onto the polystyrene surface. To link
an oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus
must have a 5'-end phosphate group. It is, perhaps, even possible for biotin to be covalently
bound to CovaLink and then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7.
The ss DNA solution is then dispensed into CovaLink NH strips (75 µl/well) standing on ice.
Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated
for 5 hours at 50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash;
first the wells are washed 3 times, then they are soaked with washing solution for 5 min., and
finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS
heated to 50°C).
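
The per-well volumes in the linkage protocol above imply a final carbodiimide concentration that the text does not state explicitly. The short calculation below is only a worked arithmetic sketch: it assumes the two solutions mix ideally (volumes are additive), and the helper name `final_concentration` is hypothetical.

```python
def final_concentration(stock_conc, stock_vol, other_vol):
    """Concentration of a stock after mixing with another volume (additive volumes assumed)."""
    return stock_conc * stock_vol / (stock_vol + other_vol)


DNA_VOLUME_UL = 75.0    # ss DNA solution dispensed per well (from the text)
EDC_VOLUME_UL = 25.0    # EDC solution added per well (from the text)
EDC_STOCK_MM = 200.0    # 0.2 M EDC stock, expressed in mM

well_volume_ul = DNA_VOLUME_UL + EDC_VOLUME_UL
edc_final_mm = final_concentration(EDC_STOCK_MM, EDC_VOLUME_UL, DNA_VOLUME_UL)

print(f"reaction volume per well: {well_volume_ul} µl")
print(f"final EDC concentration: {edc_final_mm} mM")
```

Because both the DNA solution and the EDC solution are prepared in 10 mM 1-MeIm7, the 1-MeIm7 concentration is unchanged by the mixing step.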

It is contemplated that a further suitable method for use with the present invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated
herein by reference. This method of preparing an oligonucleotide bound to a support involves
attaching a nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is then synthesized on
the supported nucleoside and protecting groups removed from the synthetic oligonucleotide
chain under standard conditions that do not cleave the oligonucleotide from the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described
by Fodor et al (1991) Science 251(4995), 767-73, incorporated herein by reference. Probes
may also be immobilized on nylon supports as described by Van Ness et al (1991) Nucleic
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988)
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al, (1994) Proc. Natl. Acad. Sci., USA 91(11),
5022-6, incorporated herein by reference. These authors used current photolithographic
techniques to generate arrays of immobilized oligonucleotide probes (DNA chips). These
methods, in which light is used to direct the synthesis of oligonucleotide probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites,
surface linker chemistry and versatile combinatorial synthesis strategies. A matrix of 256
spatially defined oligonucleotide probes may be generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example, Sambrook
et al (1989) describes three protocols for the isolation of high molecular weight DNA from
mammalian cells (p. 9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA
samples may be prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of
skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 of
Sambrook et al (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al (1990)
Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference. In this method, DNA
samples are passed through a small French pressure cell at a variety of low to intermediate
pressures. A lever device allows controlled application of low to intermediate pressures to the
cell. The results of these studies indicate that low-pressure shearing is a useful alternative to
sonic and enzymatic DNA fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the
two base recognition endonuclease, CviJI, described by Fitzgerald et al (1992) Nucleic Acids
Res. 20(14) 3753-62. These authors described an approach for the rapid fragmentation and
fractionation of DNA into particular sizes that they contemplated to be suitable for shotgun
cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the
specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from
the small molecule pUC19 (2688 base pairs). Fitzgerald et al (1992) quantitatively evaluated
the randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z
minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts
PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated
at a rate consistent with random fragmentation.
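
The difference between the normal and the relaxed (CviJI**) specificity can be made concrete with a small in silico digest. The sketch below is an illustration only, not the procedure of Fitzgerald et al: it treats the substrate as a linear, fully digested molecule (pUC19 itself is circular), cuts every matching site between the G and the C, and uses a made-up test sequence. The function names are hypothetical.

```python
import re

# Pu = A or G, Py = C or T. Lookaheads let overlapping sites be found.
NORMAL_SITES = re.compile(r"(?=[AG]GC[CT])")                      # PuGCPy
RELAXED_SITES = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # plus PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Indices at which the sequence is cut (between the G and the C of each site)."""
    pattern = RELAXED_SITES if relaxed else NORMAL_SITES
    # A site starts at m.start(); the G is at +1 and the C at +2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def digest(seq, relaxed=False):
    """Blunt-ended fragments of a linear sequence after complete digestion."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "TTAGCTAACGCCAA"  # made-up sequence: one PuGCPy site, one PyGCPy site
print(digest(example))                # normal specificity
print(digest(example, relaxed=True))  # relaxed CviJI** specificity
```

Under the relaxed specificity only PyGCPu sites remain uncut, which is why the fragment distribution approaches that of random shearing.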

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of
2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or
agarose gel electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared,
it is important to denature the DNA to give single stranded pieces available for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is
then cooled quickly to 2°C to prevent renaturation of the DNA fragments before they are
contacted with the chip. Phosphate groups must also be removed from genomic DNA by
methods known in the art.

4.22 PREPARATION OF DNA ARRAYS

Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a
DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density
of the wells is achieved. One to 25 dots may be accommodated in 1 mm², depending on the
type of label used. By avoiding spotting in some preselected number of rows and columns,
separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic
segment of DNA (or the same gene) from different individuals, or may be different, overlapped
genomic clones. Each of the subarrays may represent replica spotting of the same samples. In
one example, a selected gene segment may be amplified from 64 patients. For each patient, the
amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all samples may be
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from each patient.
Where the 96 subarrays are identical, the dot span may be 1 mm² and there may be a 1 mm
space between subarrays.
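
The 64-patient example can be checked geometrically with a short layout sketch. The following is an illustration under stated assumptions, none of which appear in the text: the 96 pins are taken to sit on the standard 9 mm microtiter-plate pitch, each subarray is arranged as an 8 x 8 grid of dots on a 1 mm pitch, and patient plates are printed in order with a fixed offset per plate. All names are hypothetical.

```python
PIN_PITCH_MM = 9.0   # assumption: standard 96-well plate spacing
DOT_PITCH_MM = 1.0   # one dot per 1 mm x 1 mm cell, as in the example
PLATE_ROWS, PLATE_COLS = 8, 12
SUBARRAY_SIDE = 8    # assumption: 64 patients arranged as an 8 x 8 subarray


def dot_position(patient, pin_row, pin_col):
    """(x, y) in mm of the dot printed by one pin for one patient plate."""
    # Offset printing: each patient plate is shifted within the subarray.
    off_row, off_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def layout(n_patients=64):
    """Map (patient, pin_row, pin_col) to the dot position on the membrane."""
    return {
        (p, r, c): dot_position(p, r, c)
        for p in range(n_patients)
        for r in range(PLATE_ROWS)
        for c in range(PLATE_COLS)
    }


dots = layout()
width = max(x for x, _ in dots.values()) + DOT_PITCH_MM
height = max(y for _, y in dots.values()) + DOT_PITCH_MM
print(len(dots), "dots in a", width, "x", height, "mm footprint")
```

Under these assumptions an 8 mm wide subarray plus the 1 mm gap exactly fills the 9 mm pin pitch, and the whole array fits within the 8 x 12 cm membrane.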

Another approach is to use membranes or plates (available from NUNC, Naperville,
Illinois) which may be partitioned by physical spacers e.g. a plastic grid molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure
to flat phosphor-storage screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of
the present disclosure, one of skill in the art will appreciate that many other embodiments and
variations may be made in the scope of the present invention. Accordingly, it is intended that
the broader aspects of the present invention not be limited to the disclosure of the following
examples. The present invention is not to be limited in scope by the exemplified embodiments
which are intended as illustrations of single aspects of the invention, and compositions and
methods which are functionally equivalent are within the scope of the invention. Indeed,
numerous modifications and variations in the practice of the invention are expected to occur to
those skilled in the art upon consideration of the present preferred embodiments. Consequently,
the only limitations which should be placed upon the scope of the invention are those which
appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated
by reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from
various human tissues and in some cases isolated from a genomic library derived from human
chromosome using standard PCR, SBH sequence signature analysis and Sanger sequencing
techniques. The inserts of the library were amplified with PCR using primers specific for the
vector sequences which flank the inserts. Clones from cDNA libraries were spotted on nylon
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to obtain signature
sequences. The clones were clustered into groups of similar or identical sequences.
Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.

5.2 EXAMPLE 2

Assemblage of Novel Contigs 

The contigs of the present invention, designated as SEQ ID NO: 2083-2534, were
assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend the
seed EST into an extended assemblage, by pulling additional sequences from different
databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and UniGene, and
exons from public domain genomic sequences predicted by GenScan) that belong to this
assemblage. The algorithm terminated when there were no additional sequences from the
above databases that would extend the assemblage. Further, inclusion of component sequences
into the assemblage was based on a BLASTN hit to the extending assemblage with BLAST
score greater than 300 and percent identity greater than 95%.
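
The seed-extension procedure described above can be sketched as a simple fixed-point loop. This is a minimal illustration under stated assumptions rather than the algorithm actually used: the real BLASTN search is replaced by a caller-supplied `search` function, the loop is written iteratively instead of recursively, and the sequence names and hit values in the toy data are invented. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: float     # alignment score reported by the search
    identity: float  # percent identity of the alignment


MIN_SCORE = 300.0     # threshold stated in the text
MIN_IDENTITY = 95.0   # threshold stated in the text (percent)


def extend_assemblage(seed, candidates, search):
    """Grow an assemblage from a seed until no candidate extends it further.

    search(members, candidate) stands in for a BLASTN query of the candidate
    against the current assemblage; it returns a Hit or None.
    """
    assemblage = [seed]
    remaining = [c for c in candidates if c != seed]
    extended = True
    while extended:                      # stop when a full pass adds nothing
        extended = False
        for cand in list(remaining):
            hit = search(assemblage, cand)
            if hit is not None and hit.score > MIN_SCORE and hit.identity > MIN_IDENTITY:
                assemblage.append(cand)
                remaining.remove(cand)
                extended = True
    return assemblage


# Toy stand-in for BLASTN: invented pairwise hits between made-up sequence names.
TOY_HITS = {
    frozenset(("seed", "a")): Hit(450, 99.0),
    frozenset(("a", "b")): Hit(320, 96.0),   # only reachable once "a" has joined
    frozenset(("seed", "c")): Hit(500, 90.0),  # identity too low
    frozenset(("b", "d")): Hit(250, 99.0),     # score too low
}


def toy_search(members, candidate):
    hits = [TOY_HITS[frozenset((m, candidate))]
            for m in members if frozenset((m, candidate)) in TOY_HITS]
    return max(hits, default=None)


print(extend_assemblage("seed", ["b", "a", "c", "d"], toy_search))
```

The toy data shows why repeated passes are needed: sequence "b" has no hit to the seed itself and can only join after "a" has extended the assemblage.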

Table 8 sets forth the novel predicted polypeptides (including proteins) encoded by the
novel polynucleotides (SEQ ID NO: 2083-2534) of the present invention, and their
corresponding translation start and stop nucleotide locations to each of SEQ ID NO: 2083-2534.
Table 8 also indicates the method by which the polypeptide was predicted. Method A refers to
a polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the
translated novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in
Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B refers to a
polypeptide obtained by using a software program called GenScan for human/vertebrate
sequences (available from Stanford University, Office of Technology Licensing) that predicts
the polypeptide based on a probabilistic model of gene structure/compositional properties (C.
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference).
Method C refers to a polypeptide obtained by using a Hyseq proprietary software program that
translates the novel polynucleotide and its complementary strand into six possible amino acid
sequences (forward and reverse frames) and chooses the polypeptide with the longest open
reading frame.
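
Method C is carried out by a proprietary program whose details are not disclosed, but the general idea of a six-frame translation followed by selection of the longest open reading frame can be sketched as follows. The sketch assumes the standard genetic code and defines an open reading frame as a stretch running from a methionine to the next stop codon or to the end of the frame; that definition, the function names and the test sequences are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Yield the three forward and three reverse-strand translations."""
    seq = seq.upper()
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            yield translate(strand[offset:])


def longest_orf(seq):
    """Longest Met-initiated stretch without a stop codon over all six frames."""
    best = ""
    for protein in six_frames(seq):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


print(longest_orf("CCATGGCTTAAGG"))   # ORF on the forward strand
print(longest_orf("CTAAAATTTCAT"))    # ORF on the reverse strand
```

A production implementation would also need to report the start and stop nucleotide positions, as listed in Table 8, and to decide how to treat frames that lack a terminating stop codon.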
Novel Nucleic Acids

The novel nucleic acids of the present invention SEQ ID NO: 1-1041 were assembled
from Hyseq's proprietary EST sequences as described in Example 1 and human genome
sequences that are available from the public databases (http://www.ncbi.nlm.nih.gov/).
Exons were predicted from human genome sequences using GenScan
(http://genes.mit.edu/GENSCANinfo.html); HMMgene
(http://www.cbs.dtu.dk/...1.html); and GenMark.hmm
(http://genemark.biology.gatech.edu/GeneMark/whmm_info.html). The Hyseq proprietary
EST sequences and the predicted exons were assembled based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than
95%. Then, the predicted genes were analyzed using Neural Network SignalP V1.1 program
(from Center for Biological Sequence Analysis, The Technical University of Denmark) for
presence of a signal peptide. These sequences were further analyzed for absence of a
transmembrane region using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).

Table 1 shows the various tissue sources of SEQ ID NO: 1-1041.

The homologs for polypeptides SEQ ID NO: 1042-2082, that correspond to
nucleotide sequences SEQ ID NO: 1-1041, were obtained by a BLASTP version 2.0al 19MP-WashU
search against Genpept release 124 using the BLAST algorithm. The results showing
homologues for SEQ ID NO: 1042-2082 from Genpept 124 are shown in Table 2.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J.
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/, herein
incorporated by reference), all the polypeptide sequences were examined to determine
whether they had identifiable signature regions. Scoring matrices of the eMatrix software
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Table 3 shows the accession number of the homologous eMatrix signature found
in the indicated polypeptide sequence, its description, and the results obtained which include
accession number subtype; raw score; p-value; and the position of signature in amino acid
sequence.

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences
were examined for domains with homology to certain peptide domains. Table 4 shows the
name of the Pfam model found, the description, the e-value and the Pfam score for the
identified model within the sequence. Further description of the Pfam models can be found
at http://pfam.wustl.edu/.

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San Diego,
CA) was used to predict the three-dimensional structure models for the polypeptides
encoded by SEQ ID NO: 1-1041 (i.e. SEQ ID NO: 1042-2082). Models were generated by
(1) PSI-BLAST, which is a multiple alignment sequence profile-based searching developed
by Altschul et al, (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High Throughput Modeling
(HTM) (Molecular Simulations Inc. (MSI) San Diego, CA), which is an automated sequence
and structure searching procedure (http://www.msi.com/), and (3) SeqFold™, which is a fold
recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)).
This analysis was carried out, in part, by comparing the polypeptides of the invention with
the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional structures
as templates. Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to
template structure; "Chain ID", identifier of the subcomponent of the PDB template
structure; "Compound Information", information of the PDB template structure and/or its
subcomponents; "PDB Function Annotation" gives function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid position of
the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold score, and the
Potential(s) of Mean Force (PMF). The verify score, produced by GeneAtlas™ software
(MSI), is based on the Profile-3D threading program developed in Dr. David
Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and Eisenberg, Nature,
356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA,
95:13597-13602. GeneAtlas normalizes the verify score for
proteins with different lengths so that a unified cutoff can be used to select good models as
follows:

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
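
The normalization can be written as a one-line function. The sketch below simply evaluates the formula as given; how GeneAtlas determines the "high score" for a protein of a given length is not stated here, so the values used in the example are invented, and the function name is hypothetical.

```python
def normalized_verify_score(raw_score, high_score):
    """(raw score - 1/2 high score) / (1/2 high score), as given in the text."""
    if high_score <= 0:
        raise ValueError("high_score must be positive")
    half_high = high_score / 2
    return (raw_score - half_high) / half_high


# Invented values: a raw score equal to the high score normalizes to 1,
# and a raw score of half the high score normalizes to 0.
for raw in (100.0, 75.0, 50.0):
    print(raw, normalized_verify_score(raw, 100.0))
```

With this scaling, a raw score below half of the high score yields a negative normalized value.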

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence identity in the
alignment used to build the model, pairwise and surface mean force potentials (MFP). As
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good
model. A SeqFold™ score of more than 50 is considered significant. A good model may
also be determined by one of skill in the art based on all the information in Table 5 taken in
totality.

Table 6 shows the position of the signal peptide in each of the polypeptides and the
maximum score and mean score associated with that signal peptide using Neural Network
SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical
University of Denmark). The process for identifying prokaryotic and eukaryotic signal
peptides and their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht,
Soren Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering, Vol.
10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean
S score, as described in the Nielsen et al reference, were obtained for the polypeptide
sequences.

Table 7 correlates each of SEQ ID NO: 1-1041 to a specific chromosomal location.

Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1-1041,
their corresponding polypeptide sequences SEQ ID NO: 1042-2082, their
corresponding priority contig nucleotide sequences SEQ ID NO: 2083-2534, their
corresponding priority contig polypeptide sequences SEQ ID NO: 2535-2986, and the US
serial number of the priority application in which the contig sequence was filed.

Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID NO: 1-1041,
the novel polypeptide sequences SEQ ID NO: 1042-2082, and the corresponding SEQ
ID NO in which the sequence was filed in priority US application 60/311,261.
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Table 1 



"Tissue Origin 


RNA/Tissue Source 


Library Name 


SEO ID NO: 


adrenal gland 


Clontech 


ADR002 


13 23 34 45 77 111 115 122 187 
194 210-21 1 249-250 255 290 

mass *\ *\ f n n ./"'n jaa >l J n /If 4 

320 357-358 362 420 443 451 
492 499 551 577 630 698 702 
713 718 805 808 819 841-843 
845 861 896 899 909 924 937 
949 985 1037 


adult bladder 


Invitrogen 


BLD001 


9 87 189 320-321 358 563 768 
840 970 


adult brain 


Clontech 


ABR001 


184-186 277 282 352 558 849 
871 898 958 


adult brain 


Clontech 


ABR006 


30 45 170 199 210 226 260 292- 
294 340 357 413 443-444 478 
499 551-552 579 582 584-588 
632-637 646 654-655 676 683 
731-732 755-756 777 813-827 
861 872 874 880 883 1002 1012 


adult brain 


Clontech 


ABR008 


15 45 54 61 67 81 87 101 106 
108 122-123 143-144 170 181- 
183 195-209 215 222 245-248 
261-270 283-289 292-293 296 
306 308-310 327 340 358 370 
394-407 409 421 428 440 442 
459 477-478 496 531-547 551- 
552 556 565-566 578-579 606 
618 620-621 629-630 651 653- 
655 664 667-668 707 713-714 
729 745 750 753 756 772 779 
788 790 793-794 799-800 802 
808 812 823 826-827 849-850 
859 862 872 883 885 898 917 
919 921 930 935-936 947 974 
985-986 992 1002 1006 1012 
1028 1030 1036 1039 


adult brain 


Clontech 


ABR011 


1012 


adult brain 


GIBCO 


AB3001 


23 57-58 67 85 296 492 499 579 
853 898-899 950 1012 


adult brain 


GIBCO 


ABD003 


45 59-62 67 72 82 85-88 156 
179-180 182 296 299 355-356 
440 458 474 483 499 563 823 
840 852 860 885 898 992 999 
1012 


adult brain 


Invitrogen 


ABR014 


45 115 238 470 599 653 974-976 


adult brain 


Invitrogen 


ABR015 


ac S" t\t\ nor 1 Ato 

45 600 885 1012 


adult brain 


Invitrogen 


ABR016 


599 1012 


adult brain 


Invitrogen 


ABT004 


34 45 54 74 84 118 138-143 170- 
171 180-181208 255 277 359 
379 428 438 499 501 536 715 
731 783 793 799 805 809 824 
862 898 912 977 998 1012 


adult cervix 


BioChain 


CVX001 


23 26 48 54 57 67 77 118 121 
177 183 238 255 271-272 296 
303 311-319 325 352 361-362 
41 1-412 419-420 424 428 440 
447 478 541 567 569 599-600 
622 699 793 805 813 831 836- 
837 839 844-845 848 863 872 
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'Tissue Origin 


RN A/Tissue Source 


Library Name 


SEQ ID NO: 








913 928-929 944 958 965 970 
973 1001 1004 


adult colon 


Invitrogen 


CLN001 


250 322-325 429 630 788 970 
985 


adult heart 


GEBCO 


AHR001 


28-30 45 6167 90-94 118 122 
150-151 183 193 250-251 279 
349-351 369-370 410 419 474 
483 485 490493 552 563 719 
773 835-836853 861 961 976 
1030 


adult kidney 


GIBCO 


AKD001 


24 3 1-34 44-46 48 55 62 67 81 
121 144 151 162 176-178 183 
251 255 258277 352 358 369- 
370 386 408 420 429 483 490 
536 546 579 599-600 602 645 
698 793 805 874 898 913 


adult kidney 


Invitrogen 


AKT002 


32 53-54 67 85 177 251 260 341 
386 408 419-420 431-436 478 
490 493 507 561 582 596-599 
698 728 788 805 819 837 844- 
848 885 898 969 989 1013 


adult liver 


Clontech 


ALV003 


101 121 193 579 638-639 729 
890-893 919 1007 1017 


adult liver 


Invitrogen 


ALV002 


75 157 173 183 212-214 236 240 
263 292 323 335 386 408 415 
495-499 552 577 589 599 727 
782 858 869 898-900 924 968 


adult lung 


GIBCO 


ALG001 


67 77 152 369 386419 443 483 
583 732 849907 


adult ovary 


Invitrogen 


AOV001 


5 26 34 43 45 48 55 61-62 64-67 
77 87 101-102 105 115 118 122- 
129 143 151 155-163 170 174- 
175 177 181-183 193 251-252 
286 292 338 347 353-354 369 
381 410 415 420 424 451 458 
483 489 497499 515 536 541 
546 552 577 579 595 599-600 
604 647 658 661 665 699 744 
782-783 800 805-806 814 831 
835 839-840844 853 874 895 
898-899 913 924 929 941-942 
949 973 977 994 1004 1007 1012 
1016 1031 1037 


adult placenta 


Clontech 


APL001 


67 419 688 728 848 930 


adult spleen 


Clontech 


SPLcOl 


82 101 187 255 260 358 370 447 
483 489 579 586648 768 835 
845 848 853-857 863 885 913 
917 962 986 


adult spleen 


GIBCO 


ASP001 


87 105 108 122 158 172 215 299 

ooa At\o Arm co cnn iqg 
380 492 4yy 552 593 oil /oD 

830 840 850 889 


adult testis 


GIBCO 


ATS001 


68-69 106 183 251 301 360 386 
520 541 570753 788 832 840 
890 916 


bone marrow 


Clontech 


BMD001 


10-12 16-19 24-26 35 46 48 58 
77 85 95-96 98-99 122 156 164 
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Table 1 



'Tissue Origin 


RNA/Tissue Source 


Library Name 


SEQIDNO: 








172 187 222 251 385 424 429 
458 478 483 489 519 568-569 
599 622-623 630-631 696 700 
758 765 794 844 914 919 924 
944 971 985 992 1001 1017 


bone marrow 


GF 


BMD002 


23 45 81-82 104-105 115 136 
144 156 170172-173 181 183 
247 287 292 306 319-320 327 

n OTA A 1 O A1Q A Ol A OA /iCJO 

362 370 418 47o-4o3 4oV 4yz 
536 548-552 565 569-570 572 
579 596 599 614-622 630 640- 

rA-\ CA1 £C1 C£ZQ /SQ1 /COO TAG 

641 643 ooi ooo byl Wif /Uo 
715-718 726 743 756 758 772 
789 841889 917 920947 958 
994 1006 1010 1037 1039 


cultured preadipocytes 


Stratagene 


ADP001 


121 255 400 490-494 51 1 629 
689 758793 835 861913 944 

f\Af\ AO A 

949 984 


endothelial cells 


Stratagene 


EDT001 


34 45 54 58 67 120-122 144 151- 

1 CA 1 Ol 1 CS1 OOO AA(\ A^ 1 

154 loJ lyo lyy ooD 44U 4.M 

a cq AQ1 AQf\ AQQ ^ 1 ^ ^^9 **/v} 
4 jo hod Hy\) *tyy D1J jOj 

569 577 579 599 622-623 752 
793 800 844-845 898-899 942 
944 949 


fetal brain 


Clontech 


FBR001 


139168 356 599 702 712 831 
845 850 872-873 898 921 1037 


fetal brain 


Clontech 


FBR004 


138 168250 363 873-875 882 


fetal brain 


Clontech 


FBR006 


14 29 45 51 81 87 101 104118 
131 143-144157 171 177 206 
208-209 215 229 238 251 261 
273 279 283 291-293 326-332 
358 362 370-371 397 400 402 
413 419428 461472 485 551- 
560 568-569 579 618 620 ozy- 
630 653-657 659-661 663-673 
675 700 714 739-742 744-746 
766 779793 809 815 819 822 

840 850 859 862 872 875-885 

niA aco <\n doc i ooo 1 nn/> 
y30 yjo y 12. yyj 1UUZ 1UUO 1UZO 

1030-1031 1038 


fetal brain 


GIBCO 


HFB001 


13-15 54-57 62 67 70-72 84 121 

1 HA 1 11 1 OA 1 Q2 A 1 H A 1 7 AO A 
1/4 1 / / loU ioD 41U 41 / 4Z*f 

485 518 520 542 552 578-579 
599 785 793 805 831-832 840 
858 871 883 898-899 977 1012 


fetal brain 


Invitrogen 


FBT002 


7 45 49 144-149 157 180 255 263 
356 493 501 600 630 707 748 
832 845 858 913 1012 


fetal heart 


Invitrogen 


FHR001 


2445 81-82 104 114-115 118 
121 144 152 181 239 247 288 
292 327 362 370 381 419 428 
444 453 458 478 486 493 503 
569 571 576 582.596 618 640 
668 674-688 719-722 731 744 
753 762 772 784 794 819 823 
836 850 885 914 944 949 957- 
958 1017 
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Table 1 



"Tissue Origin 


RN A/Tissue Source 


Library Name 


SEO ID NO: j 


fetal kidney 


Clontech 


FKD001 


82 107 208 458 483 485 536 758 

fl1Q SQd. 1017 

/OU OlV OOO 07*t LVJL 1 


fetal kidney 


Clontech 


FKD002 


1 m 1 1 5n i rq ?3R ?47 9fT3 

01 1U1 10O iOJf ZJO / auj 

292 327 340 370 405 416 419 

517 ^£0 67fl fAR fififl 

J 1 / DOy jOQ WHO UUO U07 

691 731 746-752 763 771-772 
787-788 819 840 842 854 861 
872 944 958 961 969 


fetal kidney 


Invitrogen 


FKD007 


116 


fetal liver 


Clontech 


FLV002 


iin jf<*)n /1</1 £00 £Q<t *7A/1 'TCI 

410 42y 4D4 oyz-oyj /V't lol 
805 894-895 1017 


fetal liver 


Clontech 


FLV004 


67 107 115 118151 187 241255 

oot nnr\ Ana AHQ ^1 Q ^vtQ 

287 370 466 4/8 4yz Me j4B 
552 569 582 589 630 653 668 
696-699 752-757 784 789 805 
885 908 985 


fetal liver


Invitrogen 


FLV001 


45 101 130-137 157 222 240 337 
386 428-429 492 552 589 693 
727 840 


fetal liver-spleen 


Columbia 
University 


FLS001 


1-9 18 20-23 27 34 36-38 45 55 
67 70 83 89 94 118 122 158 164 
172-173 177 183 219 238 240 
246 251 292 299 323 335 338
358 369 376 385-386 397 408 
416 419 421-422 429 451 456-
460 466 472 478 483 489-490 
493 516 536 543 546 551 569- 
573 579 586 588-589 593-595 
599-603 619 622 668 6/6 oyi 
699 702 724 731 734 743 787
789 794 800 805 834-835 840 
848 853 874 880 885 890-891
899 908 910 923 926-927 930 
939-940 944 949 958 973 980 
992 999 1004 1007 1009 1013 


fetal liver-spleen 


Columbia 
University 


FLS002 


3 8 17 22 36-37 46 55 61 63 70 
72 85 89-90 94 106 122 148 156 
158 165 172 177 181 194 213
215 219 246 251 292 299 304- 
307 323-324 338 346 355 366 
371 374 380-381 386 392 397 
410 417 421 440 455 462-464 
466-468 489-490 492-493 507- 
521 536 552 565-566 569 571- 
576 592 596 599 619 630 650 

/rr C.C\ £00 JCAO /COO TIO *71fi 

655 661 688 o98-oyy 112. Ilo 
723-729 731 735-737 753 767 
783 824 831 834 840 845 871 
885 891 894 899 902 906-909 
913 923-930 940 943 949 958 
973 980 992 999 1003 1007 1017 
1032 1040-1041 


fetal liver-spleen 


Columbia 
University 


FLS003 


23 67 106 150 158 193 338 374 
376 411 443 478 493 546 565
569-570 582 589 609-613 630 
661 699 724 727-734 767 809 
812 834-835 845 880 890 910 
929-930 958 973 980 985 1013 


fetal lung 


Clontech 


FLG001 


nnft A 1 AAO 

728 824 1008 


fetal lung 


Clontech 


FLG004 


115 668 


fetal lung 


Invitrogen 


FLG003 


120 183 322 333-336 476 516 
691 831 835 850 1012 


fetal muscle 


Invitrogen 


FMS001 


45 338-339 365 369 386 429 431 
496-497 789 793 856 970 1008 
1019 1033 1035 


fetal muscle 


Invitrogen 


FMS002 


45 115 171 247 327 365 370 405
536 642-652 668 710-711 719 
726 758-761 765 836 899 901 
907 913 948 965 1037 


fetal skin 


Invitrogen 


FSK001 


29 57 67 74 81 118 152 177 180 
193 294 340-342 345 375 397 
419 437-443 445-451 454 475
532 541 546 565 598 604 630 
650 668 728 742 772 789 793 
804-805 823 828-830 837 840 
849 899 901 922 958 970 1007 
1022 1033 


fetal skin 


Invitrogen 


FSK002 


34 45 77 81 85 115 173 200 279
292-293 360 370 381 419 428- 
429 451 466 490 551 569-570
579 600 604 630 647 668 698 
700-706 729 731 746 750 758
762-766 768-773 780 794 840 
850 859 861 885 901 911 913 
957 961 965 973 1038 


fibroblast 


Stratagene 


LFB001 


55 72 143 255 490 502-505 587 
599 627 861 863 885 984 1037 


induced neuron-cells 


Stratagene 


NTD001 


30 82 111 124 181 206 356 392
410 417 484-488 578 831-834 
898 977 1036 1039 


infant brain 


Columbia 
University 


IB2002


18 21 45 66 73-75 100-103 118
152 168-171 177 180 241-242
252 292-295 340 345 366-367 
413 438 454 499 501 542 561- 
562 578-580 599 668 702 728- 
729 745 765 768 772 793 796- 
799 823-824 863 874 887 899
948-949 967 975 977 981 983
992 995 1012 


infant brain 


Columbia 
University 


IB2003 


81 101 113 118 177 180 241 252 
293 340 345 367 371 379 381 
400 417 499-501 536 562 578 
580-581 629-630 702 713 745 
796-805 824 831 837 840 845 
874 885 967 977 981 985 1012 
1030 


infant brain 


Columbia 


TDlxyfOfi? 
JJ31VJ.UUZ 


iOO JJO *TlJ-*rl*T ?1J 




University 






infant brain 


Columbia 
University 


rosooi 


415 417 533 581 886-888 977 


leukocyte 


Clontech 


LUC003 


77 619 889 949 


leukocyte 


GIBCO 


LUC001 


34 36 38-42 50-52 55 67 77 81- 
83 85 121 137 144 158 172 183 
223 226 251 254 258 291 324 
368-374 378 424 429 443 483 
492 536 552 564 600 602 732 
ldt\ 7fiR 782 785 805 838 844- 
845 848 850 889 898 905 908 
946 973 992


lung 


55 72 143 255 490 
502-505 587 599 
627 861 863 885 
984 1037 






lung tumor 


Invitrogen 


LGT002 


55 61 65 77-79 82 102 105 115 

i 5/C 1 57 1 <5 1 £7 1 7H 1 R9 1 Rl 
IjO-ij/ 103-10/ i l\J loZ-ioJ 

197 243-244 251 253 296-297 

195 17H 1R6 41 8-410 491-49S, 

478 483 492 499 520 531 533 
541 569 577 582 600 788 844- 
845 848 874 899 911 913 916-
918 939 944 949 956 970 976 


lymph node 


Clontech 


ALN001


Al fi3 1 04-1 05 1 83 483 492 691 
894 1017 


lymphocytes 


ATCC 


LPC001 


45 51 77 1 5R 101 9S1 109 421 

*fj Jj / / 1 JO 17J ^Jl 'ta.A 

455 469-474 483 507 536 546 
579 581 618 621 640 765 780- 
787 793 838 845 875 924 968 
978 999 


macrophage 


Invitrogen 


HMP001 


122 147 157 183 251 255 493 
738 898-899 903-905 


mammary gland 


Invitrogen 


MMG001 


45 64 67 83-84 101 113 143 148 

1 CO 1 co 177 Iftl 1 Rl 1 RQ 
IjZ Ijo I04 Hi 151-loJ 107 

216-218 253 255 258 263 274 
299 336 419 421 423 426-430 

AACi AAA 47ft 4Qfi 590 511 516 
4-4 U 400 4- to UK) JZU jjj JjO 

564 569 579 582 630 646 753 

7AR 7R9 7RQ R00 R1S R40 848 

/Oo /OZ. /07 OV/V/ ojj o*tv/ o*tu 

850 883 912-913 944 950 958 


melanoma from-cell-line- 
ATCC-#CRL-1424 


Clontech 


MEL004 


69 1 5R 1 R1 90S ^62 364 402 419 

1 JO lOl Z.yO JUZ. JU*T "TV*- TA-' 

515 536 896-897 958 973 1004 
1008 


♦Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd010


353 358 823 942 982 1020 


♦Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd011


569 630 944 955 999 


♦Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd012 


9 38 59 63 80 85 122-123 152
154 177 195 217 232 246 250 

70£ inn Ififi 191 194 1R1 497 

Z"0 jUU juO JZJ-JA** jOI *r£ / | 

434 438-439 478 489 499 507 

517 51ft 55ft 565 571 57S 6^0 
J 1 / JJO J JO JOJ J / 1 J / J OJv 

657 681 701 736 762 792 800 
802 823-824 861 871-872 899 
929 941 955 968 974 985-1003 
1006 1011-1012 1033 


♦Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd013 


232 434 748 956-958 992 


♦Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd015 


18 69 115 324 335 548 551 569 
582 600 622 731 819 899 911 
944 957-958 1012 1017-1018 


♦Mixture of 16 tissues - 
mRNA 


Various Vendors 


CGd016 


46 172 183 323 371 481 493 565

5A0 571 5Q£ £30 654 6QR 

joy j / 1 jyo jyy oou ojf oyo 
745 762 786 849 907 944 1004- 
1013 1037 1039 


neuronal cells 


Stratagene 


NTU001


/ DD 4j 1U/ llO 1Z1 lJ\J LOO ZOO 

385 440 478 483 485 487 489 
536 569 582 756 768 772 819

QK QAA 05Q OAA 1 OH1 
o30 y44 yjo yOO 1UU1 


pituitary gland 


Clontech 


PIT004 


1 <Q OOO 755 "3/15 75£ 770 
1 jo ILL Ljo 34j jjO J/U j/y 

569 579 819 831 861-862 885 

COS 077 1 01 7 
oyo yLL 11/1/ 


placenta 


Clontech 


PLA003 


7 36 61 279 419 478 489 582 586 

5QO 6A\ AA7 AAR fSL\ 707-71 1 
J77 041 04/ OOo Ool /V/-/A1 

774-779 1001 


placenta 


Invitrogen 


APL002 


57 173 536 728 793 800 


prostate 


Clontech 


PRT001 


o^ oin ooo OOO /1 10 €OQ AAK *7/co 

26 219-222 22y 412 oyy ODD /oz 
835 837 860 878 951 1031 


rectum 


Invitrogen 


REC001 


9 292 343-346 431 546 714 800 
ooi yio 


retinoic acid-induced- 
neuronal-cells 


Stratagene 


NTR001 


112 400 478 569 582 629 756

75Q QOO Q10 Q71 835 836 850 
/jo oUU oiy 5J1 ojj-ojO ojw 

906 944 958 


salivary gland 


Clontech 


SAL001 


58 61 77 118 150 158 294 347-
348 483 492-493 546 752 830 
915 


skeletal muscle 


Clontech 


SKM001 


80 118 247 365 483 719 805 812
823 


small intestine 


Clontech 


SIN001 


34 37 45 52 60 93 106 119 121
138 144 177 180 208 223-225 
238 247 294 323 335-336 343 
362 370 380 386 397 409-411

A 4 y A /-\ /~\ A A f\ a r ■% Arc /ino yl OA 

416 420 440 451 455 478 489 
493 536 571 577 579 590 602 
604-608 614 622 624-628 655 
668 688 700 714 805-812 831 
841 872 894 899 914 924 926 

929 958 961 965 973 991 y9o 

1017


spinal cord 


Clontech 


SPC001 


51 164 182-183 190 226-228
255-257 275-277 286 296 299 
451 454 542 552 579 591 728
753 770 786 790 831 835 849- 

857 808 007 058 1000 1017 


stomach 


Clontech 


STO001 


72 222 232 247 258 366 645 


thalamus 


Clontech 


THA002 


45 49 113 155 164 180 183 191-
192 208 229-232 238 345 417 

AA1 CIO CCQ C07 /CIA 778 

443 JlZ DM J jo oyJL OoU /Zo 

800 823 840 858-860 885 898 
976 1012 


thymus 


Clontech


THM001


45 141 160 183 258 360 378-379 
418 451 460 569 602 619 731
788-790 819 835 845 958 965 
1004 


thymus 


Clontech 


THMc02 


47 108 115 121 144 157 173 247
259-260 300 327 340 358 362 
375-393 409 453 455 461 478- 



479 489 551 565 569-570 579 
582 615 630 640 653 668 708 
744 752 758 766 790-795 810 
819 823 835-836 845 850 853 
861 885 911 919 938 958 962 
994 1001 1027 


thyroid gland 


Clontech 


THR001 


46 58 67 80 82 144 160 177 183 
193-194 233-235 251 255 263 
268 278-280 286 299 301-303 

r\ a C O 1 TA *> O £ O A*7 A AO A 1 A 

324 358 370 386 397 408 410 
420 440 474 483 493 506 519- 
520 533 594 599-600 602 658 
661 719 758 772 785 788 793 
830 851 853 864-867 898 904 
909 924 929 961 973 991 998 
1001 1009 


trachea 


Clontech 


TRC001 


45 154 236 238 281 323 416 571 
602 868-869 913 


umbilical cord 


BioChain 


FUC001 


34 45 54 58 67 70 85 152 154
177 180 188 208 251 299 370

Al\t\ A 1 C A\(\ A*) A ACt ACC AQ1 

409 415 419 434 451-455 4o3 
596 599 647 661 733 742 793 
8ft8 8^9-840 84S 849-850 861 

OUO OJ7 O^u O^J 0"7 OJV OUi 

888 911 913 992 


uterus 


Clontech 


UTR001 


177 237-239 255 258 417 493 
520 567 599 604 646 844 870 
874 898 973 


young liver 


GIBCO 


ALV001 


45 419 440 443 490 653 732 753 
805 845 898 904 



♦The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain mRNA (Invitrogen), 2)
Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA (Invitrogen), 4) Normal adult liver
mRNA (Invitrogen), 5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7)
Normal fetal skin mRNA (Invitrogen), 8) Human adrenal gland mRNA (Clontech), 9) Human bone marrow
mRNA (Clontech), 10) Human leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA
(Clontech), 12) Human lymph node mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14)
Human thyroid mRNA (Clontech), 15) Human esophagus mRNA (BioChain), 16) Human conceptional
umbilical cord mRNA (BioChain).

Table 2



SEQ 
ID 
NO: 


Accession 
No. 


Species 


Description 


Score 


% 
Identity 


1044 


AAB32400 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 30 SEQ ID 
NO: 86. 


339 


100 


1044 


AAM74711 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ
ID NO: 35017.


335 


100 


1044 


AAM61909 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 34014. 




1 nn 


1045 


gi3859599 


Arabidopsis 
thaliana 


similar to class I chitinases (Pfam:
PF00182, E=1.2e-142, N=1)


n a 

74 


ll 


1045 


gil5292107 


Drosophila 
melanogaster 


LD38671p 


74 


n 1 


1045 


gi2258324 


Fusarium 
oxysporum f. sp. 
ciceris 


yellowing-associated protein 


73 


32 


1046 


gil7428204 


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


74 


32 


1046 


gi43 14432 


Homo sapiens 


similar to phosphatidylinositol 
(4,5)bisphosphate 5-phosphatase;
match to PID:gl399105 


71 


30 


1046 


gi|17545909| 
ref|NP_5193
11.1|


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


74 


32 


1047 


gi9756017 


Actinoplanes sp. 
50/110 


alpha-amylase 


69 


38 


1047 


gi|6572499|g
b|AAF17291
.1|


Homo sapiens 


LHX3 protein 


tin 
67 


zo 


1047 


gi|l 8572988| 
ref|XP 0291 
70.2| 


Homo sapiens 


LIM homeobox protein 3 


67 


26 


1048 


AAY28474- 


Homo sapiens 


UYJO Human Capon protein. 


721 


99 


1048 


gi2895555 


Homo sapiens 


carboxyl-terminal PDZ ligand of 
neuronal nitric oxide synthase 


721 


99 


1048 


gi2895557 


Rattus 
norvegicus 


carboxyl-terminal PDZ ligand of 
neuronal nitric oxide synthase 


/CCA 

054 




1049 


gil97 13721 


Fusobacterium 
nucleatum subsp. 
nucleatum 
ATCC 25586 


GTP-binding protein era 


oo 




1050 


gD1291 


Homo sapiens 


fumarylacetoacetase (AA 1-349) 


175 


70 


1050 


gil 82393 


Homo sapiens 


fumarylacetoacetate hydrolase


1 / J 


70 


1050 


gil2803409 


Homo sapiens 


fumarylacetoacetate


175 


70 


1052 


gi4680089 


Human
immunodeficiency
virus type 1


envelope glycoprotein 


79 


26 






JZ»piiyUaUa 

fluviatilis 


EFPDE2 


74 


20 


1052 


gi4679590 


Human
immunodeficiency
virus type 1


envelope glycoprotein 


74 


25 


1054 


gi3844648 


Mycoplasma 
genitalium 


glycerol kinase (glpK) 


71 


28 





1054 


gil8448155 


Ipomoea leaf 
curl virus 1 


AC3 


70 


27 


1054 


gi|12044888| 
ref|NP 0726 
98.1| 


Mycoplasma 
genitalium 


glycerol kinase (glpK) 


71 


28 


1056 


AAM56747 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28852. 


229 


72 


1056 


AAM67067 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27373. 


224 


69 


1056 


AAM54664. 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26769. 


224 


69 


1058 


gi|13310191| 
gb|AAK181 
89.1|AF331 
500_1 


multiple 

sclerosis 

associated 

retrovirus 

element 


recombinant envelope protein 


228 


79 


1058 


gi|21 103962| 
gb|AAM331 
41.1| 


Homo sapiens 


enverin-2 


209 


77 


1058 


gi|8272468|g 
b|AAF74215 
.1|AF15696 
3 1 


Homo sapiens 


envelope protein 


198 


75 


1059 


gi20380199 


Homo sapiens 


Similar to LOC168246 


251 


100 


1059 


gi|8388692|e 
mb|CAB940 
42.1| 


Leishmania 
major 


probable DNA-binding protein 


67 


46 


1060 


gi|21292780| 

gb|EAA049 

25.1| 


Anopheles 
gambiae str. 
PEST 


agCP4203 


70 


39 


1061 


gi330862 


Equine 
herpesvirus 1 


membrane glycoprotein 


179 


30 


1061 


gil7221106 


Equine 
herpesvirus 1 


glycoprotein gp2 


178 


34 


1061 


AAE03643 


Homo sapiens 


INCY- Human extracellular matrix and 
cell adhesion molecule-7 (XMAD-7). 


175 


29 


1062 


gi|l 1037117| 
gb|AAG274 
85.1|AF194 
537 1 


Homo sapiens 


NAG13 


334 


66 


1062 


gi|1335205|e 
mb|CAA364 
80.1| 


Homo sapiens 


ORFH 


332 


66 


1063 


gi21323402 


Corynebacterium 
glutamicum 
ATCC 13032 


ABC-type transporter, periplasmic 
component 


70 


36 


1063 


gi|1955 1869| 
ref|NP 5998 
71.11 


Corynebacterium 
glutamicum 


COG1464:ABC-type uncharacterized 
transport systems, periplasmic 
component 


70 


36 


1063 


gi|17551878|
ref|NP_499090.1|


Caenorhabditis
elegans


TPR Domain


67


37



1064 


gi2308977 


Aspergillus 
nidulans 


chitin synthase 


66 


29 


1065 


gil 8076958 


Yarrowia 
lipolytica 


Optl protein 


74 


30


1065 


gi786145 


Walleye dermal 
sarcoma virus 


envelope polyprotein 


73 


Zo 


1065 


gi2801522 


Walleye dermal 
sarcoma virus 


gprenv 


73 


28 


1066 


gi9294279 


Arabidopsis 
thaliana 


Tal 1-like non-LTR retroelement 
protein-like; CHP-rich zinc finger 
protein-like 


67 


32 


1066 


gi|20848817| 
ref|XP_1380
10.1| 


Mus musculus 


similar to HEAT SHOCK COGNATE 
PROTEIN 80 


83 


69 


1069 


AAM77637 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 37943. 


96 


65 


1069 


AAM64901 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 37006. 


96 




1069 


gi|17473741| 
ref|XP_0623
80.1| 


Homo sapiens 


similar to Meningioma-expressed 
antigen 6/11 (MEA6) (MEA11)


1 iz 


DO 


1070 


gi296288 


Homo sapiens 


histone H1


77 


44 


1070 


gi5923857 


Artemisia annua 


squalene synthase 


75 




1070 


AAO08837 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22729. 


73 


39 


1071 


gi2 1483554 


Drosophila 
melanogaster 


SD02058p 


72 


29 


1071 


gi8515845 


Homo sapiens 


hepatocellular carcinoma associated 
protein TD26 


«"7 t 

71 


3o 


1071 


gi|21483554| 
gb|AAM527 
52.1| 


Drosophila 
melanogaster 


SD02058p 


72 


29 


1072 


gi5902896 


Streptomyces 
avermitilis 


type I polyketide synthase AVES 4 


74 


ou 


1072 


gi|21301752| 

gb|EAA138 

97.1| 


Anopheles 
gambiae str. 
PEST 


agCP8235 


70 


1A 


1073 


AAV30916_ 
aal 


Homo sapiens 


GEMY Human secreted protein 
AR415 4 cDNA. 


99 


66 


1073 


ABB89113 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 1489. 


99 


66 


1073 


AAB90679 


Homo sapiens 


GEMY Human AR415_4 protein 
sequence SEQ ID 35. 


99 


66 


1074 


AAG99338 


Homo sapiens 


TAKE Human atypical tachykinin 
protein fragment SEQ ID NO: 20. 


380


92 


1074 


AAG99336 


Homo sapiens 


TAKE Human atypical tachykinin 
protein fragment SEQ ID NO: 13. 


329 


91 


1074 


AAG99333 


Homo sapiens 


TAKE Human atypical tachykinin 
protein fragment SEQ ID NO: 3. 


324 


91 


1075 


gil7945760 


Drosophila 
melanogaster 


RE33302p 


305 


29 





1075 


gi 1039447 


Saccharomyces 
cerevisiae 


T WU1 t-» 

1/pDlp 




25 


1075 


AAB 64777 


Homo sapiens 


HUMA- Human secreted protein
sequence encoded by gene 5 SEQ ID


78 


77 


1076 


AAB50261 


Homo sapiens 


CUKl- Human breast cancer associated
B726P-20 protein. 


JUO 




1076 


AAB50244 


Homo sapiens 


CUKJL- Human breast cancer associated
B726P-79 protein. 






1076 


AAB84702 


Homo sapiens 


CUKK Amino acid sequence of a
human cancer associated antigen. 




1Q 


1077 


gi2529735 


Gorilla gorilla 


glycophorin B/E precursor 


71 




1077 


AAB74724 


Homo sapiens 


INCY- Human membrane associated 
protein MEMAP-30. 


70 


31 


1077 


gi4 164424 


Schizosaccharomyces
pombe


similar to yeast cytoskeleton control 
protein Bnilp 


in 




1078 


gil8145107 


Clostridium 
perfringens 


probable transcriptional regulator 


71 


28 


1078 


gi|9581801|e 
mb|CAC005 
46.1| 


Plasmodium 
falciparum 


guanylyl cyclase 


69 


24 


1078 


gi|16805032| 
ref|NP_4730
61.1|


Plasmodium 
falciparum 


Ser/Thr protein kinase 


oy 


ZO 


1079 


gi|20886321|
ref|XP_1406
14.1|


Mus musculus 


similar to olfactory receptor, family 5, 
subfamily V, member 1; olfactory 
receptor, family 5, subfamily V 
member 1 


72 


34 


1081 


gi9650824 


Petroselinum 
crispum 


common plant regulatory factor 5 


76 


28 


1081 


gi559695 


Hydrolagus 
colliei 


This CDS feature is included to show 
the translation of the corresponding
C_region. Presently translation 
qualifiers on C_region features are 
illegal 


74


31


1081 


gi476622 


Hydrolagus 
colliei 


immunoglobulin light chain 


74 


31 


1082 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


363 


71 


1082 


AAO07159 


Homo sapiens 


riiob- riuman polypeptide o&ki jjj 

vrn o i ac i 
1NU Z1UD1. 


3 J / 


76 


1082 


AAM40991 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5922. 


343 


79 


1083


gi|17229222| 
ref|NP_4857
70.1|


Nostoc sp. PCC 
7120 


similar to HetF protein 


79 


10 


1084 


gil7221628 


Felis catus 


T-lymphocyte surface CD2 antigen


76


38


1084




Crimean-Congo
hemorrhagic
fever virus


envelope glycoprotein precursor


74 


29 


1084 


gi|1722 1628| 
dbj|BAB784 
75.1| 


Felis catus 


T-lymphocyte surface CD2 antigen 


76 


38 


1085 


gi17430213


Ralstonia
solanacearum


PUTATIVE HEMAGGLUTININ-
RELATED PROTEIN


74


26



1087 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


618 


79 


1087 


gi|4996596|d 
bj|BAA7854 
9.1|


Human 
endogenous 
retrovirus W 


polyprotein 


317 


74 


1087 


gi|9630708|r 
ef|NP 0472 
55.1| 


Feline leukemia 
virus 


gag-pol precursor polyprotein gPr80 


293 


38 


1088 


gil5075953 


Sinorhizobium 
meliloti 


PUTATIVE MOLYBDENUM 
TRANSPORT SYSTEM PERMEASE 
ABC TRANSPORTER PROTEIN 


70 


56 


1088 


gi2288880 


Arthrobacter 
nicotinovorans 


transmembrane protein 


en 
67 


JO 


1088 


gil 7298547 


Bradyrhizobium 
japonicum 


ModB 


an 
67 


DO 


1089 


AAY95660 


Homo sapiens 


ZYMO Human Zntr2 protein. 


231 


Ol 


1089 


AAU83682 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
182. 


210 


59 


1089 


AAY99386 


Homo sapiens 


GETH Human PRO1305 (UNQ671) 
amino acid sequence SEQ ID NO:153. 


210 


59 


1090 


gi7688355 


Solanum 
tuberosum 


Dof zinc finger protein 


70 


31 


1090 


gi4389445 


Drosophila 
melanogaster 


transcription factor 


67 


32 


1090 


gi|7688355|e 
mb|CAB898 
31.1|


Solanum 
tuberosum 


Dof zinc finger protein 


70 


31 


1092 


AAG78884 


Homo sapiens 


BIOW- Human ribosomal protein s5- 
17. 


90


44


1092 


AAM91239 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 18832.


nn 




1092 


AAM95026 


Homo sapiens 


HUMA- Human reproductive system 
related antigen SEQ ID NO: 3684. 


*TO 

72 


A Q 


1094 


gil 8676450 


Homo sapiens 


FLJ00122 protein


69 


1 Q 

3o 


1094 


gil 8073428 


Homo sapiens 


stabilin-2 


69 


38 


1094 


gi|20806091| 
ref|NP 0600 
34.8| 


Homo sapiens 


stabilin-2; CD44-like precursor FELL 


69 


38 


1095 


gi20906397 


Methanosarcina 
mazei Goel 


conserved protein 


76 


44 


1095 


gi|2 1299784| 

gb|EAA119 

29.1| 


Anopheles 
gambiae str. 
PEST 


agCP6531


75 


30 


1095 


gl[l /^4yU40| 

ref|NP_5223
86.1|


Ralstonia
solanacearum


PROTEIN 


i o 




1096 


AAB58317 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 655. 


678 


100 


1096 


gi862600 


Drosophila 
melanogaster 


male-specific lethal- 1 protein 


176 


25 


1096


gi601930 


Oryctolagus 
cuniculus 


neurofilament-H 


115 


24 


1097 


AAU83109 


Homo sapiens 


Z^YJVLU INOvei secreted protein 


76 1 


85 


1097 


gi|20348496| 
ref|XP_1117
12.1| 


Mus musculus 


similar to RIKEN cDNA 9030605E16 


72 


57 


1098 


gil8031887 


Mus musculus 


Fanconi anemia complementation 
group G 


77 


29 


1098 


gil2002137 


Mus musculus 


Fanconi anemia group G protein 


77 


29 


1098 


AAB72381 


Homo sapiens 


ubibNU Human hairy and enhancer of
split homologue amino acid sequence.


75 


28 


1099 


gi82 17648 


Homo sapiens 


QjD/yrlv.l (mgn-moDiniy group 
i nonnisionc LiiruiiiuouiiAaiy piuicui ± 
like 1) 


159 


70 


1099 


gi58 15432 


Gallus gallus


high mobility group protein HMG1


154 


70 


1099 


gi4140289 


Gallus gallus 


high mobility group 1 protein 


154 


70 


1100 


ABB11527


Homo sapiens 


HYSE- Human apolipoprotein B 
receptor nomoiogue, id xn^. ioy / » 


84 


26 


1100 


gi487347 


Homo sapiens 


breakpoint cluster region protein 


81 


32 


1100 


gil44050 


Bordetella 
pertussis 


filamentous hemagglutinin 


78 


30 


1102 


AAM68946 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29252. 


JZ / 


81 


1102 


AAM79768 


Homo sapiens 


HYSE- Human protein SEQ ID NO
3414.






1102 


AAM78784 


Homo sapiens 


HYSE- Human protein SEQ ID NO
1446.


lid 


80 


1103 


AAZ11186_ 
aal 


Homo sapiens 


SAGA Gene encoding transmembrane 
domain containing protein clone 
HP02239. 




fiR 
oo 


1103 


AAD31079_ 
aal 


Homo sapiens 


INCY- Human cornichon protein 

(CUKJN^ CJJINA. 


143 


68 


1103 


AAA88439_ 
aal 


Homo sapiens 


Ltd i ii Anuiumour rssxj l o i cl/in/\ 
Mr*n*» TYNJA9mO-1 190 


143 


68 


1104 


ABB07527 


Homo sapiens 


INCY- Human drug metabolizing
enzyme (DME) (ID: 5643401CD1).


562 


100 




ABB07515 


Homo sapiens


INCY- Human drug metabolizing
enzyme (DME) (ID: 8097779CD1).


562 


100 


1104 




Mus musculus


•Tamil v A rvtnrhrome P450 


431 


76 


1107 


gil3542874 


Mus musculus 


Similar to CGI-67 protein 


677 


64 


1107 


AAU81978 


Homo sapiens 


INCY- Human secreted protein SECP4. 


665 


65 


1107 


AAU77137 


Homo sapiens 


MILL- Human alpha/beta hydrolase
3 861 o polypeptide. 


665 


65 


1108 


gil3620885 


Homo sapiens 


mitochondrial ribosomal protein S6 




100 


1108 


gil3620887 


Mus musculus 


mitochondrial ribosomal protein S6 


284 


82 




ai 10711140 


Fusobacterium

nucleatum subsp. 
nucleatum 
ATCC 25586 


Fusobacterium outer membrane protein 
family 


79 


28


1109 


gil 8378673 


Homo sapiens 


PATE 


607 


89 


1109 


gi5305193 


Rattus 
norvegicus 


sperm protein 10 


108 


30 


1109


gi969103 


Mus musculus 


mSP-10


107 


27 


1110




ID US uxUTUS 


Tpti n qp in 


119 


34 


1110


glj*H J7JO 


~tJ r\-mr\ haim an C 


T DT rprpntor related nrotein 105 


110 


27 


1110 


gil3938519 


Homo sapiens 


low density lipoprotein receptor-related 
protein 3 


110 


27 


1111


gll79olU33 




■frrnTte/^rvri+irvn feint f\r *Nj"K AT"S 
UaDov/l ip UUii luLlUl i^ii r\x J 


82 


32 


1111


gil5425825 


Mus musculus 


tonicity-responsive enhancer binding 
protein 


82 


32 


1111


,-, .Yftl 1 1 A O 

gio91114o 




Mus musculus 


Transcription iaciur iNr/\x j ij>uiuijju u 


82 


32 


1112


A AH1 

gioo34473 


Metarhizium 
anisopliae var. 


adenylate cyclase, /vv^ i 


73 


30 


1113 


AAU19759 


Homo sapiens 


HUMA- Human novel extracellular 

matriY -nrntpin Spn IT) Mfk 409 
Ilia III A. (JlUlvLU, x~iy L^\J ~r\jy . 


900 


70 


1113 


gi3171934 


Mus musculus 


neuronal-STOP protein 


886 


52 


1113


giz/o90o/ 


Mus musculus 


o jl \Jr proiem 


885 


52 


1114


■ •« orcn 1 qq | 


Oenococcus oeni 


L/ppr 


72 


41 


1115


■Q1 1 ft 


Tir a n nti hi I Q r*-**\ 


lUo-IClalCU (UIU5CU 


69 


37 


1115


gi//oyo3z 


Drosophila
melanogaster 


rUb-IClalvU aliUgvll 


69 


37 


1115 


gil7862946 


Drosophila 
melanogaster 


SD04477p 


69 


37 


1116 


gi21212948 


Mus musculus 


peroxisomal protein (PeP) 


243 


83 


1116 


gi2347114 


Mus musculus 


CC chemokine receptor-5 


72 


28 


1116 


gi2431976 


Mus musculus 






28 


1117 


gi|20825251| 
ref|XP_1319
98.1|


Mus musculus 


similar to RE1-silencing transcription
factor; neuron restrictive silencer
factor; repressor binding to the X2 box


77 


40 


1117 


gi|15597871| 
ref|NP_2513
65.1|


Pseudomonas 
aeruginosa 


probable type II secretion system 
protein 


69 


41 


1118 


gi|3860513|e
mb|CAA135
74.1|


Mus famulus 


reverse transcriptase 


303 


82 


1118 


gi|3860536|e 
mb|CAA135 
77.1|


Mus saxicola 


reverse transcriptase 


303 


81 


1118 


gi|3860510|e 
mb|CAA135 
73.1|


Mus dunni


reverse transcriptase 


298 


63 


1119 


AAO04758 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 

1NVJ IOOjU. 


234 


59 


1119 


AAM69569 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID jnu: 2yo/3. 


220 


63 


1119 


AAM67717 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ
ID NO: 28023.


219 


49 


1120 


gi21107877 


Xanthomonas 
axonopodis pv. 
citri str. 306 


cytochrome C 


78 


27 


1120 


gil5292331 


Drosophila 
melanogaster 


LD47230p 


77 


42 


1120 


gil 5072444 


Avian
paramyxovirus 6


phosphoprotein


72


38








1121 


AAB44126 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO:1571.


150 


83 


1121 


gi550015 


Homo sapiens 


ribosomal protein L21


150 


83 


1121 


gi619788 


Homo sapiens 


L21 ribosomal protein 


150 


83 


1122 


AAU74448 


Homo sapiens 


OULU- Human protein sequence of 
lysyl hydroxylase 1 (LH1). 


125 


100 


1122 


gil90074 


Homo sapiens 


lysyl hydroxylase 


125 


100 


1122 


gi5817297 


Homo sapiens 


lysyl hydroxylase 1 


125 


100 


1123 


gi21281601 


Caenorhabditis 
elegans 


C. elegans PQN-44 protein 
(corresponding sequence F55A12.9c) 


78 


34 


1123 


gil4578225 


Caenorhabditis 
elegans 


C. elegans PQN-44 protein 
(corresponding sequence F55A12.9b) 


76 


38 


1123 


gi2088669 


Caenorhabditis 
elegans 


C. elegans PQN-44 protein 
(corresponding sequence F55A12.9a) 


76 


38 


1125 


AAU17301 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 866. 


344 


88 


1125 


AAE11776 


Homo sapiens 


INCY- Human kinase (PKIN)-10 
protein. 


344 


88 


1125 


AAU17304 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 869. 


340 


86 


1126 


AAM41712 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6643. 


152 


96 


1126 


AAM39926 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3071. 


152 


96 


1126 


AAM79067 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1729. 


152 


96 


1127 


AAE02938 


Homo sapiens 


MILL- Human adenylate cyclase 
25678. 


252 


98 


1127 


AAB02006 


Homo sapiens 


TEXA Adenylyl cyclase type II-C2 C2 
alpha domain. 


252 


98 


1127 


gi202752 


Rattus 
norvegicus 


adenylyl cyclase type II 


252 


98 


1128 


AAA94860_ 
aal 


Homo sapiens 


TEXA Human caspase activator Smac 
coding sequence. 


96 


100 


1128 


AAU78447 


Homo sapiens 


UYJE- Inhibitor of apoptosis (IAP) 
protein Smac. 


96 


100 


1128 


AAB26210 


Homo sapiens 


TEXA Human caspase activator Smac. 


96 


100 


1129 


gi3874765 


Caenorhabditis 
elegans 


Similarity to Drosophila acetylcholine 
receptor protein 

(SW:ACH1_DROME), contains
similarity to Pfam domain: PF00065 
(Neurotransmitter-gated ion-channel), 
Score=296.9, E-value=5e-86, N=3 


97 


30 


1129 


gi6681597 


Yaba monkey 
tumor virus 


similar to vaccinia G8R 


72 


28 


1129 


gi|17548199| 
ref|NP_5099
32.1|


Caenorhabditis 
elegans 


acetylcholine receptor 


97 


30 


1130 


gi|17564116| 
ref|NP_5064
84.1| 


Caenorhabditis 
elegans 


tyrosine-protein kinase 


73 


29 


1131 


gil3925613 


Homo sapiens 


insulinoma-associated protein IA-6 


88 


27


1131 


gil58485 


Drosophila
melanogaster


son of sevenless protein


85


24








1131 


gi7287782 


05-Feb-1998 


symbol=Sos;
synonym=BG:DS00941.4;
match=method:"sim4", score:"1000.0",
desc:"GenBank::M83931:Drosophila
melanogaster son of sevenless (Sos)
mRNA, complete cds. CDS:346..5133;
PID:g158485.", species:"Drosophila
melanogaster";
match=method:"BLASTX",
version: 2.0a19MP-WashU [Build
sol2.5-ultra 01:47:30


85 


24 


1132 


gi9696 


Mytilus edulis 


polyphenolic adhesive protein


75 


25 


1134 


gi13562016


Plectreurys tristis 


fibroin 2 


72 


29 


1134 


gi1129074


Bacillus subtilis 


beta-N-acetylglucosaminidase 


69 


28 


1134 


gi2636104 


Bacillus subtilis 


N-acetylglucosaminidase (major
autolysin) (CWBP90)


69 


28 


1135 


AAB58870 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 578. 


72 


80 


1135 


gi11595476


Homo sapiens 


RPB1 lblbeta protein 


12 


OA 

80 


1135 


AAB44840 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 11. 


69 


45 


1137 


gi206985 


Rattus 
norvegicus 


troponin I 


70 


46 


1137 


gi16945895


Takifugu 
rubripes 


SUN-like 1 


70 


31 


1137 


gi|8394466|r 
ef|NP_0588
81.1| 


Rattus 
norvegicus 


troponin I, skeletal, fast 2 


70 


46 


1140 


AAO04998 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 18890. 


111 


96 


1140 


gi19917538


Methanosarcina
acetivorans str.
C2A


mttA/Hcf106 protein
[Methanosarcina
acetivorans str. C2A]


80 


28 


1140 


gi4959705 


Mus musculus 


fibulin-2 


76 


28 


1141 


gi10141010


Vesicular 
exanthema of 
swine virus 


non-structural polyprotein 


91 


31 


1141 


gi6566147 


Drosophila 
melanogaster 


large Forked protein 


85 


30 


1141 


gi2317953


murid 

herpesvirus 4 


glycoprotein 150 


79 


28 


1142 


AAB54067 


Homo sapiens 


HUMA- Human pancreatic cancer 

antigen protein sequence SEQ ID 
\Tn. cio 


218 


56 


1142 


gi1710365


Mus musculus 


noggin 


89 


29 


1142 


gi21105761


Equus caballus


noggin 


89 


29 


1143 


gi|21295753| 

gb|EAA078 

98.1| 


Anopheles 
gambiae str. 
PEST 


agCP1560 


69 


26 


1144 


gi505094 


Homo sapiens 


similar to an actin bundling protein,
dematin.


127


35






1144 


gi2337952 


Homo sapiens 


actin-binding double-zinc-finger
protein 


IZZ 


JO 


1144 


gi21304227 


Oryza sativa 


ovule development aintegumenta-like 
protein BNM3 


76 


29 


1145 


gi|21298336| 

gb|EAA104 

81.1|


Anopheles 
gambiae str. 
PEST 


agCP2121 


68 


37 


1146 


AAW22049 


Homo sapiens 


INCY- Interferon gamma inducing 
factor-2 (IGIF-2) alternate transcript 
variant. 


T> 1 
III 


1 ftfi 

1UU 


1146 


AAV05368_ 
aa1


Homo sapiens 


SCHE cDNA encoding human 
interleukin-1-gamma.


167 


84 


1146 


AAH78060_ 
aa1


Homo sapiens 


STRD Nucleotide sequence of human 
interleukin 18 (IL-18). 


167 


84


1147 


AAY57937 


Homo sapiens 


INCY- Human transmembrane protein 
HTMPN-61. 


1Z3 


1 (\(\ 


1147 


gi|20345904| 
ref|XPJ098 
23.1| 


Mus musculus 


similar to delta-like homolog 
(Drosophila) 


105 


OO 


1148 


gi19069293


Encephalitozoon 
cuniculi 


similarity to ADP/ATP CARRIER 
PROTEIN 


75 


32 


1148 


gi8978336 


Arabidopsis 
thaliana 


contains similarity to CHP-rich zinc 
finger protein-gene id:K23F3.4 


74 


26 


1148 


gi19716318


Aspergillus 
flavus 


antigenic cell wall protein MP1




11 


1149 


gi5456699 


Emericella 
nidulans 


ATP-binding cassette multidrug 
transport protein ATRC 


70 


IS 
JJ 


1149 


gi|20898840| 
ref|XP_1393 
87.1| 


Mus musculus 


similar to HSPC038 protein


o9 


1 1 


1150 


gi3883128 


Arabidopsis 
thaliana 


arabinogalactan-protein 


96 


32 


1150 


gi17429208


Ralstonia 
solanacearum 


CONSERVED HYPOTHETICAL 
PROTEIN 


92 


26 


1150 


gi4063766 


Emericella 
nidulans 


chitinase 


91 


27 


1151 


gi13561058


Homo sapiens 


dJ1108D11.1 (novel protein similar to
C. elegans T22C1.7)


1U/ 




1151 


gi21105299


Mytilus 

galloprovincialis 


precollagen-NG 


i fK 


zo 


1151 


gi14164347


Oncorhynchus 
mykiss 


collagen a1(I)


yo 




1152 


gi18479434


Mus musculus 


olfactory receptor MOR188-1 


76 


33 


1152 


gi2653915 


Oran virus 


glycoprotein G1 and G2 precursor;
envelope glycoprotein precursor 


/Z 


A6 


1152 


gi18479436


Mus musculus 


olfactory receptor MOR188-2


72 


33 




oi^40^ 1 67 




GBAS 


161 


86 


1153 


gi12804791


Homo sapiens 


glioblastoma amplified sequence 


161 


86 


1153 


AAB57149 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1727. 


134 


81 


1154 


gi17742234


Agrobacterium
tumefaciens str.
C58 (U.
Washington)


histidase


87


35








1154 


gi15159496


Agrobacterium 
tumefaciens str. 
C58 (Cereon)


AGR_L_1400GMp 


87 


35 


1154 


gi158521


Drosophila 
melanogaster 


seven-up protein type 2 


80 


32 


1 I JJ 


gl|lU441Djl| 
gD|/vAAjrl /U 

QQ 1 1 API SO 
yy. i j/vr lo:/ 

115 1 


Cryptotermes 
domesticus 


cytochrome b 


65 


28 






Homo sapiens


xi x jl- numan polypeptide oJca^ id 
>jn?S08i 


4/j 


no 


1156 


gi20147787


"VprirvniiQ Ini^vic 
j\\j\A\jy) uo law Via 


nuclear receptor corepressor


Id 


ZD 


1156 


gi19881705


Oryza sativa 


Putative transposable element 


72 


32 


1157




Homo sapiens




fin 


lid 

34 


1157 


AAB93530 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:12884.


77 


34 


1157




Homo sapiens


fus-like protein 


77 


42


1158 


gi9795254 


Sepia officinalis 


GABA-A receptor beta subunit


71 


27 


1158


glljUZOlD/ 


Clostridium 
acetobutylicum 


amidase, germination specific 
(cwlC/cwlD B.subtilis ortholog) 


68 


34 


1158


gi|y /yoz >4|g 
.11 


Sepia officinalis 


GABA-A receptor beta subunit


71 


27 


1159 


AAB93423 


Homo sapiens 


HELI- Human protein sequence SEQ 

JLD INU:lZo41. • 


336 


100 


1159 


gi13097768


Homo sapiens 


Similar to RIKEN cDNA 2900073H19
gene 


336 


100


1159


gl2007170o 


Mus musculus 


RIKEN cDNA 2900073H19 gene 


334 


96 


1160 


AAM72558 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 32864. 


274 


100 


1160 


AAM59959 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 32064. 


274 


100 


1161 


AAB07704 


Homo sapiens 


INMR Protein encoded by the 
endogenetic fragment of HERV-W.


139 


36 


1161 


gi8272464 


Homo sapiens 


gag 


139 


36 


1161


gip /ZO/3o|g 
U|A ATl/lfi'1'7 

5.1JAF1238 
81 1 


multiple 

sclerosis 

associated 

retrovirus 

element 


gag polyprotein 


131 


35 






Homo sapiens


ijnu i - Human maat protem irom clone 

TO»inii^7AA i -inrMWA a vi o 
j^o.iuoozoH-.i.zuuuivi/vx iy. 


340


79 


1162 




Homo sapiens


\J fill r4 Untv^AH ifi v\ f% ritifYnr *>%-^a^aiv\ ^ ■ 

d\jucj- xiuman zinc nnger protem o i . 


no 
3iy 


as 

OJ 


1162 


AAB95637


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18371.


314 


67 


1163 


gi14189950


Homo sapiens 


connexin 58 


536 


84 


1163 


gi9957542 


Homo sapiens 


connexin 59 


536 


84 


1163 


gi10946367


Danio rerio 


connexin 55.5 


485 


81 


1164 


gi755700 


Bombyx mori 


sericin 1B


76 


27 


1164 


gi19569861


Dictyostelium 
discoideum 


RTOA protein (Ratio-A). 


76 


28


1164


gi10580635


Halobacterium 
sp. NRC-1 


Vng1087c


76 


25


1165 


gi19915386


Methanosarcina
acetivorans str.
C2A


WD-domain containing protein
[Methanosarcina
acetivorans str. C2A]


89 


28


1165 


gi5639663 


Homo sapiens 


WD repeat protein WDR3 


83 


28 


1165 


gi11544739


Homo sapiens 


dJ776P7.2 (WD repeat domain 3) 


83 


28 


1166 


AAM69338 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29644. 


72 


31 


1166 


AAM56953 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29058. 


72 


31 


1166 


gi20197507 


Arabidopsis 
thaliana 


expressed protein 


67 


39 


1167 


gi5802812 


Homo sapiens 


Gag protein 


83 


30 


1167 


gi7160650 


Bordetella 
bronchiseptica 


pertactin (P. 68) 


79 


31 


1167 


gi13173444


Bordetella 
bronchiseptica 


pertactin 


79 


31 


1168 


gi1495029


Danio rerio 


protein kinase CK2 alpha 1 


84 


24 


1168 


gi643443 


Penicillium 
chrysogenum 


PHOG 


82 


32 


1168 


gi|18858419| 
ref|NP_5713
15.1| 


Danio rerio 


casein kinase 2 alpha 2 


84 


24 


1169 


gi206716 


Rattus 
norvegicus 


salivary proline-rich protein 


90 


31 


1169 


gi15029903


Mus musculus 


Similar to proline-rich protein BstNI 
subfamily 2 


89 


36 


1169 


gi53182 


Mus musculus 


proline rich protein 


81 


34 


1170 


gi|17553370| 
ref|NP_4983 
18.1| 


Caenorhabditis 
elegans 


F40H6.5.p 


78 


33 


1170 


gi|15215731| 
gb|AAK914
11.1| 


Arabidopsis 
thaliana 


AT4g36780/C7A10_580 


73 


30 


1171 


gi340446 


Homo sapiens 


zinc finger protein 7 (ZFP7) 


218 


61 


1171 


AAB43928 


Homo sapiens 



HUMA- Human cancer associated 
protein sequence SEQ ID NO:1373. 


216 


58


1171 


AAB21040 


Homo sapiens 


INCY - Human nucleic acid-binding 
protein, NuABr-44. 


213 


48


1172 


AAE04368 


Homo sapiens 


INCY - Human kinase (PKIN)-9. 


120 


0<T 


1172 


AAM79153 


Homo sapiens 


HYSE- Human protein SEQ ID NO
1815. 


120


85


1172




Homo sapiens


PTTP A Hitman nnvpl STF9ft-1il«» 

protein, NOV-3& 


120 


85 


1173 


gi218572 


Pan troglodytes 


prot GOR 


74 


29 


1173 


gi243898 


Pan 


GOR 


74 


29 


1173 


gi1666473


Mus musculus 


NOV protein 


71 


50 


1174 


gi5901830 


Drosophila 
melanogaster 


BcDNA.GH07910 


74 


31


1174


AAM80237 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3883. 


71 


38 


1174 


ABB11528 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO: 1898. 


71 


38 


1175 


gi|12054759| 
emb|CAC20 
748.1| 


Podospora 
anserina 


catalase A 


65 


33 


1176 


AAM93289 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 2777. 


145 


100 


1176 


gi17431512


Ralstonia 
solanacearum 


PUTATIVE OUTER MEMBRANE 
CHANNEL LIPOPROTEIN 
TRANSMEMBRANE 


71 


26 


1176 


gi15823991


Streptomyces 
avermitilis


modular polyketide synthase 


70 


51 


1177


AAM41939 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6870. 


84 


61 


1177 


gi870751 


Homo sapiens 


N-acetylgalactosamine 6-sulfate 
sulfatase (GALNS) 


84 


61 


1177 


gi618426 


Homo sapiens 


N-acetylgalactosamine 6-sulphatase 


84 


61 


1178 


gi435855 


Mus sp. 


CREB-binding protein; CBP 


89 


22 


1178 


AAW40058 


Homo sapiens 


USSH Cellular transcriptional factor 
CBP. 


87 


22 


1178 


gi17944308


Drosophila
melanogaster 


RE12101p 


86 


26 


1179 


AAM25814 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:1329.


73 


93 


1179 


AAM25290 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:805.


73 


93 


1179 


AAM79441 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3087. 


73 


93 


1180 


AAB88388 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0131.


719 


97 


1180 


gi20810493 


Homo sapiens 


Similar to RIKEN cDNA 2810417M05 
gene 


716 


96 


1180 


AAD30543_ 
aa1


Homo sapiens 


MILL- Human B7RP-2 DNA. 


83 


38 


1181 


ABB14686


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 3343. 


190 


97 


1181 


gi14329731


Secale cereale 


high molecular weight glutenin subunit 

X 


88 


27 


1181 


gi14329761


Triticum 
aestivum 


high molecular weight glutenin subunit 

X 


84 


26 


1182 


gi11692645


Mus musculus 


aspartyl beta-hydroxylase


74 


28 


1182 


gi11878112


Mus musculus 


aspartyl beta-hydroxylase 6.6 kb 
transcript 


74 


28 


1182 


gi11878110


Mus musculus 


aspartyl beta-hydroxylase 4.5 kb 
transcript 


74 


28 


1183 


gi15485622


Homo sapiens 


Q9H4T4 like 


80


ZD 


1183 


gi19714949


Fusobacterium 
nucleatum subsp. 
nucleatum 
ATCC 25586 


TonB protein 


78 


32 


1183 


gi7717375 


Homo sapiens 


human CHD2-52 down syndrome cell 
adhesion molecule 


71 


23


1184


AAU83667 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
152. 


388 


100 


1184 


AAG89161 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 281. 


388 


100 


1184 


AAY99348 


Homo sapiens 


GETH Human PRO1194 (UNQ607)
amino acid sequence SEQ ID NO:29. 


388 


100 


1185 


AAB93506 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 12830. 


543 


100 


1185 


AAB87570 


Homo sapiens 


GETH Human PRO1268.


426 


95 


1185 


AAY78808 


Homo sapiens 


PROT- Hydrophobic domain 
containing protein clone HP10537
protein sequence. 


426 


95 


1187 


gi15823978


Streptomyces 
avermitilis 


modular polyketide synthase 


75 


41 


1187 


AAB66657 


Homo sapiens 


HSCR- Human elastin protein without 
signal peptide. 


71 


39 


1187 


AAY69137 


Homo sapiens 


UNSY Amino acid sequence of a 
human tropoelastin derivative. 


71 


39 


1188 


gi6907090 


Oryza sativa 

(japonica 

cultivar-group) 


Similar to Oryza sativa root-specific 
RCc3 mRNA. (L27208) 


76 


30 


1188 


AAY36063 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 448.


74 


26 


1188 


AAY35971 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 220. 


73 


26 


1189 


gi9827989 


Leishmania 
major 


possible CG12797 protein 


72 


36 


1189 


gi|13625467|
gb|AAK350
68.1|


Leishmania 
donovani 


LACK protective antigen 


68 


27 


1190 


gi17027071


Xiphocentron sp. 
UMSP00002937 
2-Costa Rica 


elongation factor-1 alpha


107 


27 


1190 


gi310665


Strongylocentrot 
us purpuratus 


Nf-Y-A subunit 


88 


24 


1190 


gi21743


Triticum 
aestivum 


high molecular weight glutenin subunit 
1Ax1


86 


23 


1191 


gi16878287


Homo sapiens 


Similar to C-terminal modulator protein


167 


96 


1191 


gi15866714


Homo sapiens 


C-terminal modulator protein 


167 


96 


1191 


AAO06984 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 20876. 


132 


83 


1192 


AAD05496_ 
aa1


Homo sapiens 


HUMA- Human secreted protein- 
encoding gene 5 cDNA clone 
HHBCS39, SEQ ID NO:15. 


859 


100 


1192 


AAE01707 


Homo sapiens 


HUMA- Human gene 5 encoded 
secreted protein HHBCS39, SEQ ID 
NO:119. 


859 


100 


1192


AAEOlOVO 


Homo sapiens 


HUMA- Human gene 5 encoded
secreted protein HHBCS39, SEQ ID 
NO:88. 




L\J\J 


1193


gi18650588


Homo sapiens 


retinoic acid early transcript 1 


1312 


99 


1193 


AAB15540 


Homo sapiens 


INCY- Human immune system 
molecule from Incyte clone 3402252. 


1283 


97 


1193


ABB84887 


Homo sapiens 


GETH Human PRO791 protein
sequence SEQ ID NO: 142.


1234


94






1195 


gi1196427


Homo sapiens 


gag 2 protein 


248 


50 


1195 


gi1780975


Human 
endogenous 
retrovirus K 


gag protein 


248 


50 


1195 


gi1556397


Human 
endogenous 
retrovirus K 


gag 


248 


50 


1196 


gi556256 


Leishmania 
donovani 


G protein alpha subunit 


72 


22 


1197 


AAY07237 


Homo sapiens 


ISTF Wild type monocyte chemotactic 
protein 2. 


121 


100 


1197 


AAY05300 


Homo sapiens 


ISTF C-C chemokine, MCP2. 


121 


100 


1197 


AAW42072 


Homo sapiens 


INCY- Human MC proprotein. 


121 


100 


1198 


ABB57423 


Homo sapiens 


HUMA- Human secreted protein 
encoding polypeptide SEQ ID NO 69.


187 


79 


1198 


ABB57394 


Homo sapiens 


HUMA- Human secreted protein 
encoding polypeptide SEQ ID NO 40. 


187 


79 


1198 


AAY59757 


Homo sapiens 


META- Human normal ovarian tissue 
derived protein 34. 


187 


79 


1199 


AAY72603 


Homo sapiens 


INCY- Human Electron Transfer 
Protein, ETRN-1. 


155 


100 


1199 


AAB88465 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0259. 


155 


100 


1199 


AAE03926 


Homo sapiens 


HUMA- Human gene 29 encoded 
secreted protein HTADC63, SEQ ID 
NO:89. 


155 


100 


1200 


gi6458884 


Deinococcus 
radiodurans 


chorismate mutase/prephenate 
dehydratase 


73 


42 


1201 


gi20803920 


Mesorhizobium 
loti 


HYPOTHETICAL PROTEIN 


68 


32 


1201 


gi|17545158| 
ref|NP_5185
60.1| 


Ralstonia 
solanacearum 


PUTATIVE LIPASE/ESTERASE 
PROTEIN 


66 


31 


1202 


AAM67586 


Homo sapiens


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27892. 


69 


30 


1202 


AAM55191 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 27296. 


69 


30 


1202 


gi849219 


Saccharomyces 
cerevisiae 


Pro1p: Glutamate 5-kinase (Swiss Prot
accession number P32264) 


69 


33 


1203 


gi18676554


Homo sapiens 


FLJ00174 protein 


269 


84 


1203 


gi|20913341| 
ref|XP_1267
63.1| 


Mus musculus 


similar to FLJ00174 protein


125 


81 


1203 


gi|20850247| 
ref|XP_1366
64.1| 


Mus musculus 


similar to proline-rich protein 


121 


33 


1204 


AAM68056 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 28362. 


140 


84 


1204 


AAM55676 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 27781.


140


84



WO 03/080795



PCT/US02/25485



137
Table 2



SEQ
ID
NO:


Accession
No.


Species


Description


Score


%
Identity






1205 


gi541624 


Drosophila 
virilis 


pdm2 


71 


39 


1205 


gi9955855 


Aspergillus 
oryzae 


RNA polymerase II largest subunit 


69 


38 


1205 


gi662296 


Rattus 
norvegicus 


MIBP1 


68 


32 


1206 


ABB50703 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 52 SEQ ID NO:651.


260 


94 


1206 


AAW88802 


Homo sapiens 


HUMA- Polypeptide fragment encoded 
by gene 52. 


260 


94


1206 


ABB50706 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 52 SEQ ID NO:654.


143 


96 


1207 


AAM79588 


Homo sapiens 


HYSE- Human protein SEQ ID NO
3234. 


11 


41


1207 


AAM78604 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1266. 


72 


41 


1207 


AAB58944 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 652. 


11 


41 


1208 


AAE03429 


Homo sapiens 


HUMA- Human gene 3 encoded 
secreted protein HETDB76, SEQ ID
NO: 112.


S.1K 




1208 


gi19110438


Homo sapiens 


polycystin-1L1


J / J 




1208 


AAE03463 


Homo sapiens 


HUMA- Human gene 3 encoded 
secreted protein HETDB76, SEQ ID
NO: 146. 


185 


97 


1209 


gi6760015 


Homo sapiens 


brain protein 


1 114 


85 


1209 


gi1747306


Mus musculus 


SDR2 


151 


31 


1209 


gi20381292 


Mus musculus 


stromal cell derived factor receptor 2


151


31 


1211 


gi14043211


Homo sapiens 


Similar to RIKEN cDNA 4931428F04 
gene 


460 


89 


1211 


gi190508


Homo sapiens 


salivary proline-rich protein precursor 


113 


28 


1211 


gi12862320


Homo sapiens 


WDC146


1 AO 




1212 


AAO14407 


Homo sapiens 


FARB Human 11 beta-hydroxysteroid 
dehydrogenase 1-like enzyme. 


291 


63 


1212 


AAM79592 


Homo sapiens 


HYSE- Human protein SEQ ID NO
3238. 


917 
Lit 


4^ 


1212 


gi4581319 


Homo sapiens 


dJ2oO 1 U.3(iioD 1 1 d 1 (nyaroxysteroia 
(11-beta) dehydrogenase 1) 


917 


4^ 


1213 


AAR06514 


Homo sapiens 


STRI Natural human Platelet Factor-
4var1 encoded by EcoRI fragment.


238 


64 


1213 


gi292390 


Homo sapiens 


platelet factor 4 


238 


64 


1213 


AAZ28361_ 
aa1


Homo sapiens 


SMIK Platelet factor-4 (PF-4) 
nucleotide sequence. 


200 


56 


1214 


AAD12580_ 
aa1


Homo sapiens 


SAGA Human protein having 
hydrophobic domain encoding cDNA 
clone HP10753.


162 


82 


1214 


AAD08193_ 
aa1


Homo sapiens 


HUMA- Human secreted protein- 
encoding gene 3 cDNA clone 
HNTAC64, SEQ ID NO: 13.


162 


82 


1214 


AAD05544_ 
aa1


Homo sapiens 


HUMA- Human secreted protein- 
encoding gene 12 cDNA clone 
HNTAC64, SEQ ID NO:63.


162 


82


1215


gi21429094


Drosophila 
melanogaster 


LD38004p 


354 


49 


1215 


gi15292155


Drosophila 
melanogaster 


LD40717p 


354


49 


1215 


AAG75596 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:6360. 


294


50 


1216 


gi7248894


Xenopus laevis 


Arg protein-tyrosine kinase 


84


35 


1216 


gi402191 


Mus musculus 


HNF-3beta 


80 


26 


1216 


gi404764 


Mus musculus 


fork head related protein 


80 


26 


1218 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


559 


74 


1218 


AAO03505 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 17397. 


502 


81 


1218 


AAM40991 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5922. 


467 


66


1220 


AAO01188 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15080. 


248 


86 


1220 


AAY73334 


Homo sapiens 


INCY- HTRM clone 1805061 protein 
sequence. 


79 


35 


1220


gi20249 


Oryza sativa 


et-2 


77 


32 


1221 


gi4519619 


Haliotis discus 


collagen pro alpha-chain 


90 


28 


1221 


gi7380690 


Neisseria 

meningitidis 

Z2491 


UDP-N-acetylglucosamine--N-
acetylmuramyl-(pentapeptide)
pyrophosphoryl-undecaprenol N- 
acetylglucosamine transferase 


90 


37 


1221 


gi7225645 


Neisseria 

meningitidis 

MC58 


UDP-N-acetylglucosamine--N-
acetylmuramyl-(pentapeptide)
pyrophosphoryl-undecaprenol N- 
acetylglucosamine transferase 


90 


37 


1222 


ABA05334_ 
aa1


Homo sapiens 


MILL- Human fucosyltransferase 
family member 32132 coding 
sequence. 


2154 


99 


1222 


AAM47905 


Homo sapiens 


MILL- Human fucosyltransferase 
family member 32132. 


2154 


99 


1222 


ABA05333_ 
aa1


Homo sapiens 


MILL- Human fucosyltransferase 
family member 32132 encoding cDNA. 


2154 


99 


1223 


AAY21852 


Homo sapiens 


INCY- Human signal peptide-
containing protein (SIGP) (clone ID
2652271). 


150 


100 


1223 


AAY48563 


Homo sapiens 


META- Human breast tumour- 
associated protein 24. 


150 


100 


1223 


AAW75103 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 47 clone HMCBP63. 


150 


100 


1224 


AAM67078 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27384. 


517 


99 


1224 


AAM54676 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26781. 


517 


99 


1224 


gi17467358


Sus scrofa 


MDF2 suppressor 


184 


80 


1225 


gi9454237 


Cochliobolus 
sativus 


DNA binding protein MAT-1 


73 


30 


1225 


gi21428792


Drosophila 
melanogaster 


GH03582p 


72 


38


1225


gi6633838 


Arabidopsis 
thaliana 


F2K11.15 


70 


31 


1226 


gi21430124


Drosophila 
melanogaster 


HL01222p 


76 


28 


1226 


AAM77437 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ 
ID NO: 37743. 


72 


33 


1226 


AAM64659 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36764. 


72 


33 


1227 


AAM50715 


Homo sapiens 


MILL- Human TRP-like calcium 
channel-5 (TLCC-5). 


243 


83 


1227 


gi|20874183|
ref|XP_1310
03.1|


Mus musculus 


similar to hornerin 


80 


29 


1227 


gi|17864717|
gb|AAK157
91.1|


Mus musculus 


hornerin 


80 


29 


1229 


gi4019247


Ateline 
herpesvirus 3 


thymidine kinase 


71 


46 


1229 


gi2760368 


Drosophila 
melanogaster 


Sharpei/DRhoGEF2 


70 


26 


1229 


gi17862944


Drosophila 
melanogaster 


SD04476p 


70 


26 


1230 


gi4559296 


Mus musculus 


silencing mediator of retinoic acid and 
thyroid hormone receptor extended 
isoform 


80 


30 


1230 


gi18181872


Mus musculus 


GATA-2 protein 


78 


41 


1230 


gi18033511


Rattus 
norvegicus 


transcription factor GATA-2 


78 


41 


1231 


gi13365501


Cyprinus carpio 


integrin beta2-chain 


75 


27 


1231 


gi3322933 


Treponema 
pallidum 


DNA ligase (lig) 


73 


32 


1231 


gi|13365501|
dbj|BAB391
30.1|


Cyprinus carpio 


integrin beta2-chain 


75 


27 


1232 


AAM79791 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3437. 


78 


35 


1232 


AAM78807 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1469. 


78 


35 


1232 


AAB19338 


Homo sapiens 


INCY- Amino acid sequence of a 
human fibrous protein (FIBR). 


78 


35 


1233 


AAU21459 


Homo sapiens 


HUMA- Human novel foetal antigen, 
SEQ ID NO 1703. 


87 


26 


1233 


gi15081227


Arabidopsis 
thaliana 


glycine-rich protein GRP20 


75 


37 


1233 


gi2645433 


Homo sapiens 


CHD3 


74 


30 


1234 


A AT TOI CHtZ 

AAU8367o 


Homo sapiens 


GETH Human PRO protein, Seq ID No

170. 


178 


97 


1234 


ABB84911 


Homo sapiens 


GETH Human PR01244 protein 
sequence SEQ ID NO: 190. 


178 


97 


1234 


AAB62403 


Homo sapiens 


CURA- Human MBSP7 polypeptide 
(clone 3499605.0.64). 


178 


97 


1235 


ABB10348 


Homo sapiens 


HUMA- Human cDNA SEQ ID NO:
656.


409


61






1235 


AAU18012 


Homo sapiens 


HUMA- Human immunoglobulin 
polypeptide SEQ ID No 157.


178 


83 


1235 


ABB89226 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 1602. 


78 


82 


1236 


gi10566951


Rattus 
norvegicus 


s-gicerin/MUC18 


85 


45 


1236


gll03OO94y 


Rattus 
norvegicus 


l-gicerin/MUC18


85 


45 


1236 


AAB90798 


Homo sapiens 


NOJI/ Human shear stress-response
protein SEQ ID NO: 96.


84 


42 


1238 


gi21464300 


Drosophila 
melanogaster 


GH20068p 


95 


36 


1238 


gi3868879 


Xenopus laevis 


Zic-related-2 


88 


35 


1 


gllo41 lot 


Mus musculus


GATA-5 cardiac transcription factor


87 


52 


1239


gil /y4ozoo 


Drosophila 
melanogaster 


Kc6179Jp 


96 


40


1239




Gallus gallus 


formin binding protein 11-related
protein 


91 


27 






African swine 
fever virus 


pU407JL 


88 


30






Homo sapiens 


MILL- Human TANGO 457 protein.


1331 


100 


1240 


AAE05303 


Homo sapiens 


MILL- Human mature TANGO 457 
protein. 


1207 


100 


1240 


AAE05305 


Homo sapiens 


MILL- Human TANGO 457 protein 
cytoplasmic domain. 


1201 


100 


1241


gi5640111


Lycopersicon 
esculentum 


RAD23 protein 


84 


25 


1241 


gi17131739


Nostoc sp. PCC 
7120 


polyketide synthase type I 


76 


33 


1241 


gi|5640111|e
mb|CAB515
44.1|


Lycopersicon 
esculentum 


RAD23 protein 


84 


25 


1242 


AAG03496 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7577. 


67 


39 


1242 


gi|13876270| 
gb|AAK260 


Mus musculus 


protocadherin alpha 8 


66 


35 


1243 


AAE16665 


Homo sapiens 


MILL- Human calcium channel family 
member, 21784 protein. 


196 


87 




A A"D*O1y10 

AAxJ0z24o 


Homo sapiens 


WARN Human calcium channel 
alpha2delta subunit.


196 


87 


1243 


AAY92320


Homo sapiens


WARN Human alpha-2-delta-C 
calcium channel subunit polypeptide.


196 


87 


1244


gi|4102990|g 

b|AAD0163

7.1|


Aspergillus 
nidulans 


DNA polymerase epsilon homolog 


70 


30 


1245




Zea mays


extensin-like protein


94


26 


1245 


gi19481644


shrimp white 
spot syndrome 
virus 


WSSV052 


89 


36 


1245 


gi17016928


shrimp white 
spot syndrome 
virus 


wsv001


89 


36


1246 


AAO12623


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 25515. 


169 


69 


1246 


AAO12822


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 26714. 


1 5^ 


75 


1246 


AAO02255 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 16147. 


i zj 


65


1247 


gi1653353


Synechocystis 
sp. PCC 6803 


nodulation protein 


75 


OR 
zo 


1247 


gi4468626


Mus musculus


TEF-5


1A 
/H 


OA 
zo 


1247 


gi17430764


Ralstonia 
solanacearum 


SKWP PROTEIN 5 


1A 
l*\ 




1248 


gi15139973


Sinorhizobium 
meliloti 


CONSERVED HYPOTHETICAL 
PROTEIN 


77 


47 


1249 


gi7191078 


Leishmania 
major 


L712.2 


99 


29 


1249 


gi17384256


Homo sapiens 


mucinS 


85 


31


1249 


gi5821153 


Homo sapiens 


RNA binding protein 


OJ 




1250 


AAY36495 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 27. 


124 


86 


1250 


AAO12122


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26014. 


1 01 


Q1 > 
y 1 


1250 


AAB95063 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:16901.


101 
1Z1 


on 


1252 


gi|15839838| 
ref|NP_3348
75.1| 


Mycobacterium 

tuberculosis 

CDC1551 


membrane protein, MmpL family 


A8 
Oo 


07 
z / 


1254 


AAG00399 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 4480. 


10B 
j/o 


L\)\J 


1254 


gi21428466 


Drosophila 
melanogaster 


LD22609p 


OJ 


Ah 


1254 


gi19914274


Methanosarcina 
acetivorans str. 
C2A 


sensory transduction histidine kinase 
[Methanosarcina 


Of 
OJ 


OA 
ZO 


1256 


gi14161094


Choloepus 
didactylus 


von Willebrand Factor 


80 


24 


1256 


gi14161092


Cyclopes 
didactylus 


von Willebrand Factor 


78 


23 


1256 


gi13872552


Acomys 
cahirinus 


von Willebrand Factor 


77 


23 


1258 


gi7008025 


Callithrix 
jacchus 


prochymosin 


715 


64 


1258 


gi11990126


Camelus 
dromedarius 


chymosin 






1258 


gi491952 


synthetic 
construct 


preprochymosin 


618 


56 


1259 


gi|21402709| 
ref|NP_6586
04.1|


Bacillus 

anthracis A2012 


AMP-binding, AMP-binding enzyme 
[Bacillus anthracis 


72 


34 


1260 


gi|4505431|r 
ef|NP_0025
10.1| 


Homo sapiens 


nuclear protein, ataxia-telangiectasia 
locus; NPAT gene; E14 gene 


64 


33 


1260 


gi|15309894|
ref|XP_0408
46.2| 


Homo sapiens 


similar to nuclear protein, ataxia- 
telangiectasia locus; NPAT gene; E14 
gene 


64 


33


1260 


gi|1304114|d 
bj|BAA1186 
1.1|


Homo sapiens 


NPAT 


64 


33


1261 


gi4519535


Homo sapiens 


Leukotriene B4 omega-hydroxylase 


133 




1261 


gi1857022


Homo sapiens 


leukotriene B4 omega-hydroxylase 


133


/IO 


1261 


gi18266446


Homo sapiens 


cytochrome P450, subfamily IVF, 
polypeptide 2 


133 


49 


1262 


gi13363530


Escherichia coli 
O157:H7


cell division protein HflB/FtsH 
protease 


fy 




1262 


gi746401 


Escherichia coli 


ATP-binding protein 


79 


26 


1262 


gil46028 


Escherichia coli 


ftsH 


79 


26 


1263 


AAW67859 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 53 clone HBMCL41. 


283 


100 


1264 


gil 1066248 


Helix lucorum 


presenilin 


85 


21 


1264 


gi|19115422| 
ref|NP_5945 
10.1| 


Schizosaccharom 
yces pombe 


ribonuclease II RNB family protein; 
dis3-like 


69 


30 


1264 


gi|14720912| 
ref|XP_0382 
04.1| 


Homo sapiens 


similar to Matrin 3 


69 


32 


1265 


gi5757703 


Mus musculus 


syntrophin-associated serine-threonine 
protein kinase 


82 


38 


1265 


gi4996035 


Human 
herpesvirus 6 


69.8% identical to U47 gene of strain 
U1102 of HHV-6 


76 


42 


1265 


gi330951 


Gallid 

herpesvirus 1 


ICP4 


76 


36 


1266 


gi|17511177| 
ref|NP_4933 
24.1| 


Caenorhabditis 
elegans 


ZK1053.3.p 


75 


40 


1266 


gi|17538077| 
ref|NP_4951 
59.1| 


Caenorhabditis 
elegans 


ZK1248.2.p 


69 


34 


1267 


gi915540 


Ovis aries 


pregnancy-specific antigen 


85 


25 


1267 


gi6179989 


Capra hircus 


pregnancy-associated glycoprotein-2 


84 


25 


1267 


gi9798658 


Rhinolophus 
ferrumequinum 


pepsinogen A 


80 


23 


1268 


gi|15789526| 
ref|NP_2793 
50.1| 


Halobacterium 
sp. NRC-1 


serine proteinase; HtrA 


69 


30 


1269 


gi9988674 


Influenza A virus 
(A/Swine/Wisco 
nsin/14094/99(H 
3N2)) 


hemagglutinin protein 


70 


24 


1269 


gi6552676 


Influenza A virus 
(A/Bangkok/1/97 
(H3N2)) 


hemagglutinin 


70 


25 


1269 


gi6552638 


Influenza A virus 
(A/lnnidaa/j i/y 
6(H3N2)) 


hemagglutinin 


70 


24 


1270 


gi3378527 


Zea mays 


anther specific protein 


87 


41 


1270 


AAW15787 


Homo sapiens 


PENN- Human metastasis suppressor 
KiSS-1. 


85 


28 


1270 


gi21410770 


Homo sapiens 


Similar to RIKEN cDNA 1500005K14 
gene 


84 


46 
SEQ ID 
NO: 


Accession 
No. 


Species 


Description 


Score 


% 
Identity 


1271 


gil 335527 


Human 
poliovirus 1 


reading frame VP3 


fj 


JO 


1271 


gi61253 


Human 
poliovirus 1 


polyprotein 


/J 


18 

JO 


1271 


gi|17453412| 
ref|XP_0631 
32.1| 


Homo sapiens 


similar to 60S ribosomal protein L7A 
(Surfeit locus protein 3) 


/O 


AO 


1272 


AAU87081 


Homo sapiens 


BRIM Sialic acid-binding Ig-related 
lectin, Siglec-11. 


69 


43 


1272 


AAU87077 


Homo sapiens 


BRIM Sialic acid-binding Ig-related 
lectin, Siglec-BMS-L3d. 


69 


43 


1272 


AAU87076 


Homo sapiens 


BRIM Sialic acid-binding Ig-related 
lectin, Siglec-BMS-L3c. 


69 


43 


1273 


AAA09121_ 
aal 


Homo sapiens 


CURA- Clone 2355875 cDNA 
(update), encodes syncollin homologue. 


720 


100 


1273 


AAY92233 


Homo sapiens 


CURA- Clone 2355875f - syncollin 
homologue. 


720 


100 


1273 


AAB54267 


Homo sapiens 


HUMA- Human pancreatic cancer 
antigen protein sequence SEQ ID 
NO:719. 


*71 ^ 




1274 


gil5559064 


Mus musculus 


SNAG1 


10Q 
1V5 




1274 


AAU17435 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 1000. 


131 


62 


1274 


AAW99023 


Homo sapiens 


MOUN 17G2 peptide sequence. 


131 


OZ 


1275 


gi|6753732|r 
ef|NP_0342 
43.1| 


Mus musculus 


epidermal growth factor 


65 


30 


1275 


gi|50801|em 
b|CAA2411 
5.1| 


Mus musculus 


polyprotein 


65 


30 


1275 


gi|20341089| 
ref|XP_1093 
85.1| 


Mus musculus 


epidermal growth factor 


65 


30 


1276 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


44/ 


/o 


1276 


AAM40991 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5922. 


424 


74 


1276 


AAO07159 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 21051. 


401 


75 


1277 


gil3905120 


Mus musculus 


RIKEN cDNA Oo 100 1311 / gene 


1 "XA 




1277 


gil3936283 


Mus musculus 


TRH3 


134 


35 


1277 


AAB92625 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 10921. 


127 


35 


1279 


AAM66940 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27246. 


362 


85 


1279 


AAM54534 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26639. 


362 


85 


1279 


gi|208153|gb 
|AAA73184. 

1| 


synthetic 
construct 


crystal toxin 


79 


40 


1280 


AAE05187 


Homo sapiens 


INCY- Human drug metabolising 
enzyme (DME-18) protein. 


484 


100 


1280 


AAU12266 


Homo sapiens 


GETH Human PRO5780 polypeptide 
sequence. 


484 


100 


1280 


AAY91631 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 24 SEQ ID 
NO:304. 


484 


100 


1281 


AAH46856_ 
aal 


Homo sapiens 


HUMA- Human serine/threonine 
phosphatase encoding cDNA (clone ID 
HLDOO20). 


238 


100 


1281 


AAG77801 


Homo sapiens 


HUMA- Human HLDOO20 
serine/threonine phosphatase protein 
sequence. 


238 


100 


1281 


AAB85476 


Homo sapiens 


HUMA- Human serine/threonine 
phosphatase (clone ID HLDOO20). 


238 


100 


1282 


gi|14762786| 
ref|XP_0478 
71.1| 


Homo sapiens 


GS2 gene 


*7A 

70 


OA 


1283 


gi3860165 


Arabidopsis 
thaliana 


disease resistance protein RPP1-WsB 


69 


38 


1283 


AAO09033 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22925. 


68 


38 


1283 


gi6967115 


Arabidopsis 
thaliana 


disease resistance protein homolog 


68 


38 


1285 


gil055252 


Rattus 
norvegicus 


pheromone receptor VN5 


78 


32 


1285 


gi2746733 


Drosophila 
virilis 


circadian clock protein 


73 


26 


1285 


gi2641617 


Drosophila 
virilis 


TIM 


73 


26 


1286 


gi6013135 


Rattus 
norvegicus 


coxsackie-adenovirus-receptor 
homolog 


86 


67 


1286 


AAV50429_ 
aal 


Homo sapiens 


UYNY Human coxsackievirus and Ad2 
and Ad5 receptor (HCAR) cDNA. 


Q1 
OD 


7^ 
/J 


1286 


AAV28845_ 
aal 


Homo sapiens 


DAND Human coxsackievirus and 
adenovirus receptor encoding DNA. 


OD 


7^ ' 


1287 


AAU83224 


Homo sapiens 


ZYMO Novel secreted protein 
Z930757ulzr. 


A/19 

Oh/. 


1 nn 


1287 


AAY70692 


Homo sapiens 


DAND Human soluble attractin-2. 


84 


54 


1287 


AAY70691 


Homo sapiens 


DAND Human membrane attractin-2. 


8/1 




1288 


AAW70326 


Homo sapiens 


GEMY Secreted protein DU123 1. 


1655 


99 


1288 


ABB12473 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 312. 


D4/ 


/Z 


1288 


gi5689736 


Homo sapiens 


Myopodin protein 


475 


100 


1289 


gi4103543 


Tomato chlorosis 
virus 


heat shock protein 70 


15 


zy 


1289 


gil2247413 


Cristatella 
mucedo 


cytochrome b 


TO 
11 


D\) 


1289 


gi|4103543|g 

ul A AT"i0.17Q 

0.1| 


Tomato chlorosis 

VII Uo 


heat shock protein 70 


71 


zy 


1291 


AAB94128 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14383. 


520 


98 


1291 


AAY85576 


Homo sapiens 


JANC Hs-UNC-53/1 fragment/GFP 
fusion insert of plasmid pGI3150. 


520 


98 


1291 


AAY85564 


Homo sapiens 


JANC Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


520 


98 


1292 


AAY01413 


Homo sapiens 


HUMA- Secreted protein encoded by 
gene 31 clone HHBAG64. 


907 


Q7 


1292 


AAY05324 


Homo sapiens 


GEMY Human secreted protein 

ljl67 Z>. 


207 


97 


1292 


gil5 157864 


Agrobacterium 
tumefaciens str. 
C58 (Cereon) 


AGR_C_4816p 


71 


34 


1294 


AAB12146 


Homo sapiens 


PROT- Hydrophobic domain protein 
from clone HP10672 isolated from 
Thymus cells. 


219 


100 


1295 


gi|17228767| 
ref|NP_4853 
15.1| 


Nostoc sp. PCC 
7120 


probable glycogen phosphorylase 


78 


34 


1295 


gi|10835203| 
ref|NP_0011 
27.1| 


Homo sapiens 


advanced glycosylation end product- 
specific receptor 


65 


58 


1295 


gi|190846|gb 
|AAA03574. 


Homo sapiens 


receptor for advanced glycosylation 
end products 


OJ 


58 

JO 


1296 


gil751 1816 


Homo sapiens 


Similar to RIKEN cDNA [illegible] 
gene 


xxuo 


99 


1296 


AAB88440 


Homo sapiens 


HELI- Human membrane or secretory 
protem clone ra&KJjzz./.. 


688 


100 


1296 


gi7211438 


Homo sapiens 


golgin-67 




30 


1298 


gil8314436 


Homo sapiens 


Similar to RIKEN cDNA 4921511C04 
gene 


481 


79 


1298 


gil872546 


Mus musculus 


NIK 


86 


25 


1298 


gi5533305 


Homo sapiens 


somatostatin receptor interacting 
protein splice variant a 


OJ 


70 
z.y 


1299 


gil334643 


Xenopus laevis 


APEG precursor protein 


105 


27 


1299 


gil7428053 


Ralstonia 
solanacearum 


PROBABLE RIBONUCLEASE E 
(RNASE E) PROTEIN 


i oo 


19 


1299 


gi6690017 


Herpesvirus 
papio 


NTR 


OA 

yo 


9*? 


1300 


AAB87346 


Homo sapiens 


HUMA- Human gene 5 encoded 
secreted protem riur icoj, ony uj 


JoO 


74 
/*t 


1300 


AAB44298 


Homo sapiens 


GETH Human PRO706 (UNQ370) 

nrntpin cpnnpnrp <JPO FT) lSlO'^SS 
pit) IC 111 oCt-JllCU^C JDV^ U_/ liVfJOJ. 


586 


74 


1300 


A A Wt 1 '7/!'"> 


Homo sapiens 


VXCIXI XlUIIlail rivU / uu piuiciii 

CP/11 ipiipp 

sccjucuwC 


586 


74 


1301 


gi218572 


Pan troglodytes 


prot GOR 


1344 


62 


1301 


gl24389o 


ran 




1040 


68 


1301 


gil7862570 


Drosophila 
melanogaster 


LD38414p 


486 


45 


1302 


gil3276598 


Homo sapiens 


dJ6 1404.7 (Novel protein) 


260 


28 




gll JJ7 / Out 


Homo sapiens 


dJ616B8.3 (novel gene) 


230 


30 


1302 


AAB56641 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO:1219. 


226 


30 


1303 


gi603989 


Drosophila 
melanogaster 


salivary gland glue protein 


149 


23 


1303 


gil3324584 


Borrelia 
burgdorferi 


LMP1 


129 


17 


1303 


gil61956 


Trypanosoma 
cruzi 


surface antigen 


128 


13 


1304 


gil3569248 


Human 

immunodeficiency 
virus type 1 


gag protein 


81 


34 


1304 


gi4324832 


Human 

immunodeficienc 
y virus type 1 


gag-pol polyprotein 


oU 




1304 


gil 1691875 


Mus musculus 


ADP-ribosylation factor 1 GTPase 
activating protein 


79 


22 


1305 


AAO06469 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 20361. 


191 


100 


1305 


gi3608368 


Xenopus laevis 


origin recognition complex associated 
protein p81 






1305 


ABB15196 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 3o5i. 


68 


36 


1306 


AAE03657 


Homo sapiens 


INCY- Human extracellular matrix and 
cell adhesion molecule-21 (XMAD- 
21). 


i no 


97 
Z / 


1306 


ABB11890 


Homo sapiens 


HYSE- Human protocadherin 
Flamingo 1 homologue, SEQ ID 
NO:2260. 


109 


27 


1306 


gl344929o 


Homo sapiens 


JVLtlljrZ 


ioq 


27 


1308 


gi9294050 


Arabidopsis 
thaliana 


protein kinase-like protein 


84 


32 


1308 


gil5983765 


Arabidopsis 
thaliana 


AT3g24550/MOB24_8 


84 


32 


1308 


gil3877617 


Arabidopsis 
thaliana 


protein kinase-like protein 


CA 
54 




1309 


AAU00375 


Homo sapiens 


BERN/ Human stem cell growth factor 
receptor. 


127 


54 


1309 


AAE07145 


Homo sapiens 


SALK Human Kit/stem cell factor 
receptor kinase insert region. 


127 


54 


1309 


gi3236223 


Equus caballus 


tyrosine kinase receptor homolog 


127 


50 


1310 


gi21449343 


Actinosynnema 
pretiosum subsp. 
auranticum 


polyketide synthase 


II 


40 


1310 


gi21114513 


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


transcriptional regulator 


IS 


36 


1310 


gil33643o4 


Escherichia coli 
O157:H7 


acetylglutamate kinase 


i j 


^o 


1311 


gi20146220 


Oryza sativa 

(japonica 

cultivar-group) 


similar to splicing factor/activator 
protein 




DD 


1311 


gi206712 


Rattus 
norvegicus 


salivary proline-rich protein 


104 


27 


1311 


AAY84592 


Homo sapiens 


UNIW Amino acid sequence of a 
human artemin polypeptide. 


103 


34 


1312 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


530 


69 


1312 


gi|10834720| 
gb|AAG237 
90.1|AF258 
587_1 


Homo sapiens 


PP565 


249 


66 


1312 


gi|13 194728| 
gb|AAK155 
2o.l|AF329 
4D1 1 


Gallus gallus 


pol-like protein ENS-3 


115 


21 


1 1 1 1 


AAWUiJiJ 


Homo sapiens 


onJvJ rtuman l/ijljvIou protein. 


147 


JO 


1 1 1 *3 


giiijyyiu 


Homo sapiens 


jjuujv i ou proiem 


147 
l*r / 


CO 
Jo 


1313 


gil504002 


Homo sapiens 


similar to a human major CRK-binding 
protein DOCK180. 


111 


43 


1314 


gil2007418 


Mus musculus 


B3 olfactory receptor 


76 


38 


1314 


gil8480290 


Mus musculus 


olfactory receptor MOR260-3 


10 


OS 


1314 


gil2007432 


Mus musculus 


B3 olfactory receptor 


76 


38 


1315 


gi483581 


Mus musculus 


Notch 3 


82 


26 


1315 


gil8159668 


Pyrobaculum 
aerophilum 


paREP2b 


81 


29 


1315 


gi4584086 


Spermatozopsis 
similis 


p2 10 protein 


79 


25 


1316 


AAM71305 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO:31611. 


422 


98 


1316 


AAM58790 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30895. 


422 


98 


1316 


gil49490 


Lactococcus 
lactis 


sucrose-6-phosphate hydrolase 


72 


31 


1317 


gi 1620040 


Paramecium 
bursaria 

Chlorella virus 1 


Asp-rich 


72 


28 


1317 


gi3721615 


Cyprinus carpio 


MEF2C 


71 


25 


1317 


gi|9631936|r 
ef|NP_0487 
25.1| 


Paramecium 
bursaria 

Chlorella virus 1 


Asp-rich 


72 


28 


1318 


gi|21291797| 

gb|EAA039 

42.1| 


Anopheles 
gambiae str. 
PEST 


agCP3974 


74 


35 


1319 


gi2 1306283 


Chlamydomonas 
reinhardtii 


iron transporter Ftrl 


74 


OA 

30 


1319 


AAB60461 


Homo sapiens 


INCY - Human cell cycle and 
proliferation protein CCYPR-9, SEQ 
ID NO:9. 


73 


n 


1319 


gi6013155 


Homo sapiens 


p35srj 


73 


33 


1320 


gi9717245 


Mus musculus 


cytoplasmic dynein heavy chain 


430 


94 


1320 


gi402528 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


430 


94 


1320 


gi294543 


Rattus 
norvegicus 


dynein heavy chain 


430 


94 


1323 


gi|l7221411| 
emD|L//vu iz 
639.ll 


Burkholderia 
cepacia 


kdo transferase 


70 


34 


1324 


gil698601 


Cricetulus 
griseus 


beta-1,6-N- 
acetylglucosaminyltransferase 


440 


38 


1324 


gi349091 


Rattus 
norvegicus 


N-acetylglucosaminyltransferase V 


438 


43 


1324 


gil 8997007 


Mus musculus 


N-acetylglucosaminyltransferase V 


438 


43 
1325 


AAM70545 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 30851. 


115 


47 


1325 


AAM58098 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30203. 


115 


47 


1325 


AAM72994 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 33300. 


111 


28 


1326 


gil 2724969 


Lactococcus 
lactis subsp. 
lactis 


phenolic acid decarboxylase 


77 


46 


1327 


AAB53097 


Homo sapiens 


GETH Human angiogenesis-associated 
protein PRO1246, SEQ ID NO:167. 


372 


63 


1327 


AAU12416 


Homo sapiens 


GETH Human PRO1246 polypeptide 
sequence. 


372 


63 


1327 


AAY99377 


Homo sapiens 


GETH Human PRO1246 (UNQ630) 
amino acid sequence SEQ ID NO: 132. 


372 


63 


1328 


gi6014505 


Hepatitis GB 
virus B 


polyprotein 


76 


43 


1328 


gi765145 


Hepatitis GB 
virus B 


polypeptide 


68 


41 


1328 


gi|20544059| 
ref|XP_0862 
20.4| 


Homo sapiens 


similar to U4/U6-associated RNA 
splicing factor 


294 


100 


1329 


AAV42689_ 
aal 


Homo sapiens 


SIBI- DNA encoding human calcium 
channel alpha-2 subunit 


158 


91 


1329 


AAQ84667_ 
aal 


Homo sapiens 


SALK Human neuronal calcium 
channel subunit alpha 2c. 


158 


91 


1329 


AAQ84664_ 
aal 


Homo sapiens 


SALK Human neuronal calcium 
channel subunit alpha 2b. 


158 


91 


1330 


gil9923 


Nicotiana 
tabacum 


pistil extensin like protein, partial CDS 


71 


38 


1330 


gi|144429|gb 
|AAA56792. 
1| 


Cellulomonas 
fimi 


beta-1,4-xylanase 


67 


30 


1331 


gi2388676 


Mytilus edulis 


precollagen P 


85 


35 


1331 


gil7862044 


Drosophila 
melanogaster 


LD06016p 


75 


30 


1331 


gil3879780 


Mycobacterium 

tuberculosis 

CDC1551 


PE_PGRS family protein 


74 


30 


1333 


AAO00015 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 13907. 


442 


61 


1333 


AAB82479 


Homo sapiens 


ZYMO Human RING finger protein 
Zapop2. 


81 


31 


1333 


gi20975274 


Homo sapiens 


skeletrophin 


81 


31 


1334 


ABB11819 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2189. 


JO/ 




1334 


AAW80398 


Homo sapiens 


GEMY A secreted protein encoded by 
clone cwl543 3. 


130 


67 


1334 


gi5081693 


Samanea saman 


pulvinus inward-rectifying channel 
SPICK2 


70 


34 


1335 


ABB89969 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 2345. 


142 


96 


1335 


AAB38385 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 18 clone HTLEJ24. 




OA 

yo 


1335 


AAB38338 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 18 clone HTLFE57. 


142 


96 


1336 


gi|14590195| 
ref|NP_1422 
60.1| 


Pyrococcus 
horikoshii 


asparaginyl-tRNA synthetase 


70 


37 


1337 


gi3879419 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF00102 (Protein-tyrosine 
phosphatase), Score=51.6, 
E-value=1.8e-14, N=1 


69 


29 


1337 


gi|17563828| 
ref|NP_5059 
65.1| 


Caenorhabditis 
elegans 


protein tyrosine phosphatase 


69 


29 


1338 


gi|2072960|g 

b|AAC5126 

8.1| 


Homo sapiens 


p40 


138 


33 


1338 


gi|4185940|e 
mb|CAA768 
80.1| 


Human 
endogenous 
retrovirus K 


env protein 


124 


75 


1338 


gi|757872|e 

mb|CAA577 

23.1| 


Human 

endogenous 

retrovirus 


env 


124 


75 


1340 


gjl491979 


Molluscum 
contagiosum 
virus subtype 1 


MC036R 


78 


33 


1340 


gi|9628968|r 
ef|NP_0439 
87.1| 


Molluscum 

contagiosum 

virus 


MC036R 


78 


33 


1341 


gil8676514 


Homo sapiens 


FLJ00154 protein 


1 jOU 


i fin 

1UU 


1341 


AAB84252 


Homo sapiens 


HUMA- Amino acid sequence of a 
human cytokine receptor-like protein. 


572 


63 


1341 


AAB84251 


Homo sapiens 


HUMA- Human cytokine receptor-like 
protein fragment. 


572 


03 


1342 


AAY27757 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene No. 47. 


152 


71 


1342 


AAB27551 


Homo sapiens 


MYRI- Human tumour suppressor 
BRG1 encoded by cDNA mutated at 
base 1705. 


77 


32 


1342 


AAB27550 


Homo sapiens 


MYRI- Human tumour suppressor 
BRCrl protem irom cell unes uu 14 D 
and N CJ-ii 1 3UU. 


11 




1344 


gi21464394 


Drosophila 
melanogaster 


RE18651p 


78 


26 


1344 


AAM39065 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2210. 


77 


21 


1 1AA 


goJoZz/U 


Homo sapiens 


sono pruicm 


77 


21 


1345 


gi2202 


Canis sp. 


Clox 


135 


37 


1345 


gi3879551 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF01391 (Collagen triple helix repeat 
(20 copies)), Score=56.4, E-value=2e- 
13, N=2; PF01484 (Nematode cuticle 
collagen N-terminal domain), 
Score=[illegible], E-value=1.1e-22, N=1 


125 


33 


1345 


gil 58695 


Drosophila 
melanogaster 


tropomyosin isoform 33 (9C) 


110 




1346 


gi7862077 


Giardia 
intestinalis 


3-hydroxy-3-methylglutaryl-coenzyme 
a reductase 


90 


26 


1346 


gil0986l5 


Mycoplasma 
pneumoniae 


adhesin-related 30 kDa protein 


87 


23 


1346 


gi20380058 


Homo sapiens 


Similar to PRAM-1 protein 


84 


28 


1347 


gil3905302 


Mus musculus 


Similar to ATPase, class II, type VA 


/Jo 


OJ 


1347 


gil7862322 


Drosophila 
melanogaster 


LD22U9p 


633 


72 


1347 


AAM25271 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NU:/oo. 


572 


100 


1348 


gi4563l9 


Bacteriophage 
FC1 


74kDa protein 


75 


33 


1348 


gil524H5 


Lycopersicon 
esculentum 


subtilisin-like endoprotease 


73 


28 


1348 


gi4200334 


Lycopersicon 
esculentum 


P69A protein 


73 


28 


1349 


gi21391988 


Drosophila 
melanogaster 


HL08052p 


no 

lo 


31 


1349 


gi20148339 


Arabidopsis 
thaliana 


cyclin delta-3 


77 


25 


1349 


gi|17647607| 
ref|NP_5234 
23.1| 


Drosophila 
melanogaster 


maroon-like; bronzy; section 5 


78 


31 


1351 


gi18676524 


Homo sapiens 


FLJ00159 protein 


1 

104 


DZ 


1351 


gi21392066 


Drosophila 
melanogaster 


RE04357p 


139 


34 


1351 


AAB92637 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 10953. 


81 


43 


1352 


gil9071965 


Aspergillus 
oryzae 


chitin synthase 


79 


28 


1352 


gil7945592 


Drosophila 
melanogaster 


RE26660p 


78 


41 


1352 


gil6184663 


Drosophila 
melanogaster 


LD28370p 


74 


22 


1353 


gi|l 1037117| 
gb|AAG274 

Of 1 1 A T? 1 Ay* 

85.l|AF194 
537 l 


Homo sapiens 


XT A PI 1 

NAU13 


3\Jt 


Oj 


1353 


gi|l335205|e 
mb|CAA364 

OA 1 1 

80.1| 


Homo sapiens 


ORF11 




00 


1354 


gil388166 


Drosophila 
melanogaster 


Bowel 


80 


32 


1354 


gi!5553187 


Scyliorhinus 
canicula 


homeodomain protein Otx1 


*70 


00 

zz 


1354 


AAY85573 


Homo sapiens 


JANC Hs-UNC-53/3 fragment/GFP 
fusion insert of plasmid pGI3303. 


78 


26 


1358 


gi|21288288| 

gb|EAA006 

09.1| 


Anopheles 
gambiae str. 
PEST 


agCP9766 


71 


30 


1358 


gi|17465558| 
ref|XP_0698 
88.1| 


Homo sapiens 


similar to mucin 


68 


36 


1359 


gi|21302892| 

gb|EAA150 

37.1| 


Anopheles 
gambiae str. 
PEST 


agCP5020 


70 


31 


1361 


gil 5080686 


Lentinula edodes 


CDC5 


79 


OiC 

2o 


1361 


gi495516 


Plasmodium 
vivax 


circumsporozoite protein 


77 


31 


1361 


gi21070569 


Dictyostelium 
discoideum 


VSAE2 (FRAGMENT). 3/101 


76 


31 


1362 


gi8953400 


Arabidopsis 
thaliana 


1-D-deoxyxylulose 5-phosphate 
synthase-like protein 


73 


23 


1362 


gi|15239030| 
ref|NP_1966 
99.1| 


Arabidopsis 
thaliana 


1-D-deoxyxylulose 5-phosphate 
synthase - like protein 


73 


23 


1363 


gi2444430 


Xenopus laevis 


deacetylase 


327 


0 1 
51 


1363 


gi602098 


Xenopus laevis 


yeast RPD3 homologue 


324 


80 


1363 


AAB49954 


Homo sapiens 


METH- Human histone deacetylase 
HDAC-1. 


323 


80 


1364 


AAM69686 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29992. 


418 


55 


1364 


AAM57281 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29386. 


418 


55 


1364 


gi|1780971|e 
mb|CAA714 
16.1| 


Human 
endogenous 
retrovirus K 


gag protein 


172 


37 


1365 


gi437084 


Gallus gallus 


vitamin D3 hydroxylase associated 
protein 


510 


41 


1365 


gi2149156 


Homo sapiens 


fatty acid amide hydrolase 


477 


38 


1365 


AAW57783 


Homo sapiens 


SCRI Human fatty acid amide 
hydrolase. 


468 


38 


1366 


gi3510695 


Homo sapiens 


DNA polymerase theta 


77 


21 


1366 


gi309132 


Mus musculus 


calnexin 


72 


22 


1366 


gil5214567 


Mus musculus 


Similar to calnexin 


72 


22 


1367 


gi|17508849| 
ref|NP_4914 
26.1| 


Caenorhabditis 
elegans 


helicase 


73 


40 


1368 


gi5457567 


Pyrococcus 
abyssi 


Na+/H+ antiporter (napA-1) 


76 


33 


1368 


gi8247211 


Candida albicans 


She9 protein 


69 


31 


1368 


gi|14590079| 
ref|NP_1421 
43.1| 


Pyrococcus 
horikoshii 


Na(+)/H(+) antiporter 


76 


30 


1369 


gil7644260 


Homo sapiens 


bB206I2Ll (ATPase, Class VI, type 
HC) 


305 


98 


1369 


AAU1420U 


Homo sapiens 


INCY- Human transporter and ion 
channel TRICH-17. 


166 


50 


1369 


gi5080816 


Arabidopsis 

thaliana 


Putative ATPase 


166 


49 


1370 


gi|18573281| 
ref|XP 0959 
33.1| 


Homo sapiens 


similar to 40S ribosomal protein S3A 


70 


38 


1372 


gi6683562 


Mus musculus 


heparan sulfate 6-sulfotransferase 3 


886 


91 


1372 


gi6683558 


Mus musculus 


heparan sulfate 6-sulfotransferase 2 


265 


72 


1372 


ABL39900_ 
aal 


Homo sapiens 


SEGK Human HS6ST2v encoding 
cDNA SEQ ID NO: 1. 


262 


71 


1373 


gi|20882231| 
ref|XP 1392 
03.1| 


Mus musculus 


similar to LIM domain only 7 






1373 


gi|20302988| 
gb|AAM189 
48.1|AF498 
989 1 


Medicago sativa 


nodule-specific glycine-rich protein 3 


72 


26 


1373 


gi|9965267|g 

b|AAG1000 

8.1| 


infectious 
hypodermal and 
hematopoietic 
necrosis virus 


non-structural protein 1 




OA i 


1374 


gi3355835 


Rhizobium etli 


RBSK 


78 




1374 


gi7453560 


Polyangium 
cellulosum 


epoD 


73 


28 


1374 


gil749684 


Schizosaccharom 
yces pombe 


similar to Saccharomyces cerevisiae 
porphobilinogen deaminase SWISS- 
PROT Accession Number P28789 


72 


28 


1375 


gil6973455 


Danio rerio 


beta-3-galactosyltransferase 


1UJU 




1375 


AAB24035 


Homo sapiens 


GETH Human PRO4397 protein 
sequence SEQ ID NO:42 


725 


46 


1375 


AAB88404 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0159. 


709 


43 


1376 


gi7668 


Drosophila 
melanogaster 


bsg25D protein 


73 


33 


1376 


gi20177037 


Drosophila 
melanogaster 


LD21844p 


73 


33 


1376 


gil353669 


Caenorhabditis 
elegans 


UNC-24 


69 


43 


1379 


AAS16182_ 
aal 


Homo sapiens 


GENA- Human apolipoprotein CI 
(APOC1) DNA. 


245 


67 


1379 


AAU10534 


Homo sapiens 


GENA- Human apolipoprotein CI 
(APOC1) polypeptide 


245 


67 


1379 


AAS16825_ 
aal 


Homo sapiens 


GENA- Human apolipoprotein CI 
(APOC1) DNA coding sequence. 


245 


67 


1380 


AAY36290 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 67. 


177 


74 


1380 


gil6551305 


Tatianyx 
arnacites 


DNA-directed RNA polymerase beta' 
subunit 2 


71 


jo 


1380 


gi3411013 


Candida albicans 


protein mannosyltransferase 1 


68 


3S 


1381 


AAM80132 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3778. 


173 


66 


1381 


gi4731867 


Dictyostelium 
discoideum 


sterol glucosyltransferase 


107 


30 


1381 


AAB74726 


Homo sapiens 


INCY- Human membrane associated 
protein MEMAP-32. 


89 


41 


1382 


AAB62100 


Homo sapiens 


WIST- Human bridging integrator-2 
(Bin2) protein. 


78 


27 


1382 


gi6527168 


Homo sapiens 


breast cancer associated protein 
BRAP1 


78 


27 


1382 


gi5852834 


Homo sapiens 


bridging integrator-2 


78 


27 


1383 


gi7670050 


Xenopus laevis 


type I collagen alpha 1 


92 


1*7 
LI 


1383 


AAO01606 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15498. 


85 


29 


1383 


gil7738485 


Agrobacterium 
tumefaciens str. 
C58 (U. 
Washington) 


biopolymer transport protein 


85 


28 


1384 


gi20451261 


Caenorhabditis 
elegans 


C. elegans GCY-17 protein 
(corresponding sequence W03F11.2) 


71 


26 


1384 


gi2665714 


Agrobacterium 
tumefaciens 


moaC 


71 


29 


1384 


gi|20864452| 
ref|XP_1500 
76.1| 


Mus musculus 


RIKEN cDNA 2410018E23 


130 


59 


1385 


AAY94938 


Homo sapiens 


GEMY Human secreted protein clone 
ye78_1 protein sequence SEQ ID 
NO:82. 


103 


25 


1385 


gjl2831176 


Agelaius 
phoeniceus 


gamma filamin protein 


yo 




1385 


AAU81998 


Homo sapiens 


INCY- Human secreted protein 
SECP24. 


0*7 


0*7 


1386 


gil0440468 


Homo sapiens 


FLJ00070 protein 




A 1 


1386 


gilll36912 


Danio rerio 


RPTP-alpha protein 


94 


32 


1386 


gi20377083 


Homo sapiens 


p78 


92 


36 


1387 


AAM40810 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5741. 


190 


59 


1387 


AAM39024 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2169. 


190 


59 


1387 


gil 5080474 


Homo sapiens 


Similar to RIKEN cDNA 170002301 1 
gene 


190 


59 


1388 


gil2802591 


Bovine 
herpesvirus 4 


tegument protein 


82 


30 


1388 


gi950226 


Saccharomyces 
cerevisiae 


Trf4p 


73 


26 


1388 


gi|13095641| 
ref|NP_0765 
56.1| 


Bovine 
herpesvirus 4 


tegument protein 


82 


30 


1389 


AAI67224_ 
aal 


Homo sapiens 


CORI- B5 1 IS cDNA sequence. 


363 


100 


1389 


AAF85500_ 
aal 


Homo sapiens 


EOSB- Nucleotide sequence of a 
human breast cancer protein designated 
BCH1. 


"1 /T 1 

363 


1 Art 

100 


1389 


AAA54120_ 
aal 


Homo sapiens 


EOSB- Breast cancer protein BCH1 
coding sequence. 


363 


100 


1390 


gil84653 


Homo sapiens 


IFN-alpha responsive transcription 
factor 


74 


30 


1390 


gi|2580453|g 

D|AABo2J3 

6.1| 


Xenopus laevis 


Xbap 


68 


47 


1391 


AAB88456 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0246. 


85 


52 


1391 


AAB62392 


Homo sapiens 


LEXI- Human LDL receptor family 
protein (LDLP). 


85 


52 


1392 


ABB12009 


Homo sapiens 


HYSE- Human RAMP1 homologue, 
SEQ ID NO:2379. 


90 


100 


1392 


gi3171910 


Homo sapiens 


RAMP1 


on 


inn 


1392 


gil2653551 


Homo sapiens 


receptor (calcitonin) activity modifying 
protein 1 


90 


100 


1394 


gi4467343 


Drosophila 
melanogaster 


EG:140G11.1 


70 


27 


1394 


gi6018879 


Drosophila 
melanogaster 


BACN4L24.d 


70 


27 


1394 


gil 57993 


Drosophila 
melanogaster 


developmental protein 


70 


27 


1395 


gi4928919 


Arabidopsis 
thaliana 


zinc finger protein 2 


86 


26 


1395 


gi2702272 


Arabidopsis 
thaliana 


expressed protein 


60 


ZD 


1396 


AAM25276 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:791. 


729 


93 


1396 


AAE14340 


Homo sapiens 


INCY- Human protease PRTS-5 
protein. 


528 


33 


1396 


AAB47561 


Homo sapiens 


INCY- Protease PRTS-3. 






1397 


gil 8369843 


Infectious 
salmon anemia 
virus 


P6 


89 


40 


1397 


gi4092530 


Infectious 
salmon anemia 
virus 


NS1 protein 




io 
5y 


1397 


gil4009648 


Infectious 
salmon anemia 
virus 


NS1 


87 


39 


1398 


AAW63707 


Homo sapiens 


UYOR- Human hbJsSl protein. 


JD 1 


01 

y i 


1398 


gil575663 


Rattus 
norvegicus 


calcium-activated potassium channel 
rSK2 


DDL 




1398 


gil5082148 


Homo sapiens 


small-conductance calcium-activated 
potassium channel 


331 


91 


1399 


AAB01381 


Homo sapiens 


INCY- Neuron-associated protein. 


1653 


68 


1399 


gil8157547 


Mus musculus 


pecanex-like 3 


1620 


66 


1399 


gi6650377 


Mus musculus 


pecanex 1 


1277 




1400 


gi|20887681| 
ref|XP_1405 
75.1| 


Mus musculus 


similar to melastatin 1 


468 


91 


1400 


gi|3243075|g 

b|AAC8000 

0.1| 


Homo sapiens 


melastatin 1 


355 


75 


1400 


gi|20552333| 
ref|XP_0076 
62.9| 


Homo sapiens 


similar to melastatin 1 


355 


75 


1401 


AAU15955 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 908. 


Sol 


09 


1401 




Homo sapiens 


PITSLRE protein kinase alpha SV9 
isoform 


95 


24 


1401 


gil517914 


Homo sapiens 


monocytic leukaemia zinc finger 
protein 


91 


28 


1402 


gil289326 


Mus musculus 


ROR-alpha 1 


84 


25 


1402 


gi530878 


Chlamydomonas 
eugametos 


amino acid feature: N-glycosylation 
sites, aa 41 43, 46 .. 48, 51 .. 53, 72 .. 


79 


32 








/4, 1U/ .. 1ZO 1JU, 1JZ 1->H, 

158 160, 163 165; amino acid 
feature: Rod protein domain, aa 169 .. 
jhu, amino acia leaiuic. giuuuuu. 
protein domain, aa 32 168 






1402 


gi220763 


Rattus 
norvegicus 


HES-3 factor 


79 


52 


1403 


gi|20479430| 
ref|XP_1149 
55.1| 


Homo sapiens 


similar to olfactory receptor MOR23 1- 
1 


71 


32 


1403 


gi|20480897| 
ref|XP_1150 
14.1| 


Homo sapiens 


nimilor tr* rxlfantixnr rPf^pntnT A/TOT? 7^4— 

Slinxiar 10 oiiociory icuopiui ivi\»/i\-t.j*T 
3 


71 


32 


1404 


AAA OOC/1 Q 

AAAooD4o_ 

aa1 


Homo sapiens 


SMIK Human CASB616 cDNA 


89 


100 


1404 


AAB19591 


Homo sapiens 


SMIK Human CASB616. 


89 


100 


1404 


glllOOHO 


Homo sapiens 


protein tyrosine kinase 


89 


100 


1405 


gi4206753 


Oryctolagus 
cuniculus 


Ti/\m*»nr1r\TYn»'iTi-f*r»Tif'5iinTnO' "nrntftiTi 


74 


24 


1405 


gil3445253 


Mus musculus 


rwrihnri rrnr^^-liVt* nrntein 1 
\JL pLUxLl vJ JJ1-J /— jxiva* jjavjiasui i 


72 


33 


1405 


gi3080552 


Mus musculus 


Hoxa-9 


71 


50 


1406 


AAM50585 


Homo sapiens 


IN loo £>emgn prosiauc nypeipidMa 
associated protein JT460914. 


325 


100 


1406 


gil 803 1947 


Homo sapiens 


OxJk^O DOX piOlCUI /\OJ3"J 


325 


100 


1406 


AAU20593 


Homo sapiens 


HUMA- Human secreted protein, Seq 

UL/ INO JOJ. 


316 


100 


1407 


AAU83222 


Homo sapiens 


ZYMO Novel secreted protein 


895 


97 


1407 


AAY02712 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 63 clone HBJFV28. 


91 


56 


1407 


AAO00641 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 14533. 




64 


1408 


ABB17944 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6601. 


81 


53 


1408 


AAM77906 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 38212. 


72 


40 


1408 


AAM65199 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 37304. 


72 


40 


1409 


gi5230847 


Vitreoscilla sp. 
CI 


glutamine synthetase homolog 


68 


33 


1409 


gi8515736 


Drosophila 
melanogaster 


highwire 


67 


35 


1409 


gB 138797 


Sulfolobus 
shibatae 


Ssh7b 


65 


48 


1410 


AAW2330y 


Homo sapiens 


■RTTT- Human Wpttipt 1 *! QvnHrnmp ^VS-2 

protein. 


151 


96 


1410 


gil913785 


Homo sapiens 


Rep-8 


151 


96 


1410 


gil 8089098 


Homo sapiens 


reproduction 8 


151 


96 


1411 


gi|21297468| 

gb|EAA096 

13.1| 


Anopheles 
gambiae str. 
PEST 


agCP15537 


166 


56 


1411 


gi|20983200| 
ref|XP_1358 
12.1| 


Mus musculus 


RIKEN cDNA 1810030007 


73 


24 










1412 


gi532572 


Hordeum 
vulgare 


lipoxygenase 1 


82 


28 


1412 


gi945419 


Mus musculus 


hepatoma derived growth factor 
(HDGF) 


77 


■*s 
jj 


1412 


gil7932895 


stork hepatitis B 
virus 


preC/core antigen 


77 


26 


1413 


gi2370143 


Homo sapiens 


immunoglobulin-like domain- 
containing 1 


169 


42 


1413 


gi2645890 


Homo sapiens 


IGSF1 


169 


42 


1413 


AAB40232 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 46 SEQ ID 
NO: 142. 


162 


40 


1414 


gi21204314 


Staphylococcus 
aureus subsp. 
aureus MW2 


proline-tRNA ligase 


78 


32 


1414 


gil4247033 


Staphylococcus 
aureus subsp. 
aureus Mu50 


proline-tRNA ligase 


78 


32 


1414 


gil3701063 


Staphylococcus 
aureus subsp. 
aureus N315 


proline-tRNA ligase 


72 
to 


10 


1415 


gi9948469 


Pseudomonas 
aeruginosa 


probable non-ribosomal peptide 
synthetase 


78 
/o 


^1 

D I 


1415 


AAE19251 


Homo sapiens 


BIOI- SOS1 protein sequence from 
PS462. 


75 


23 


1415 


AAU84311 


Homo sapiens 


BAAK/ Protein ABCB2 differentially 
expressed in breast cancer tissue. 


/4 


in 


1416 


gi18676710 


Homo sapiens 


FLJ00254 protein 


623 


75 


1416 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 




oy 


1416 


gi|18676710| 
dbj|BAB850 
07.1| 


Homo sapiens 


FLJ00254 protein 


623 


75 


1417 


AAR85785 


Homo sapiens 


UYNY Human GRB-10. 


77 


32 


1417 


gi841210 


Mus musculus 


growth factor receptor binding protein 
Grb10 


/ / 


10 


1417 


AAM90963 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ 
ID NU:1oDjo. 






1419 


AAM79990 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3636. 


82 


100 


1419 


AAM79006 


Homo sapiens 


HYSE- Human protein SEQ ID NO 

1668. 


82 


100 


1419 


AAR28494 


Homo sapiens 


XIAM/ Sequence encoded by the 
CAMPATH-1 antigen cDNA. 


82 


100 


1420 


AAU01383 


Homo sapiens 


MILL- Human TANGO 499 form 2, 
variant 1 amino acid sequence. 


828 


73 


1420 


AAU01382 


Homo sapiens 


MILL- Human TANGO 499 form 2, 
variant 4 amino acid sequence. 


828 


73 


1420 


AAU01380 


Homo sapiens 


MILL- Human TANGO 499 form 2, 
amino acid sequence. 


828 


73 


1421 


gi19069609 


Encephalitozoon 
cuniculi 


PROTEASOME REGULATORY 
SUBUNIT YTA6 OF THE AAA 
FAMILY OF ATPASES 


76 


26 






1422 


AAM66177 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 

ID NO: 26483. 


199 


72 


1422 


AAM53791 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25896. 




72 


1422 


AAM68472 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 

ID NO: 28778. 


176 


81 


1423 


gil800227 


Oryza sativa 


Bowman-Birk proteinase inhibitor 


74 


34 


1423 


gil0141005 


San Miguel sea 
lion virus 


non-structural polyprotein 






1423 


gi|17490177| 
ref|XP_0623 
00.1| 


Homo sapiens 


similar to RING finger protein 18 
(Testis-specific ring-finger protein) 


76 


28 


1424 


gi461336 


Pyrenomonas 
salina 


hsp70 


75 


29 


1424 


gil3880037 


Mycobacterium 

tuberculosis 

CDC1551 


membrane protein, MmpL family 


75 


24 


1424 


gil449306 


Mycobacterium 

tuberculosis 

H37Rv 


mmpL2 


75 


24 


1425 


gil5600 


Enterobacteria 
phage T7 


gene 7.3, host range 


79 


30 


1425 


gi16198065 


Drosophila 
melanogaster 


LD28477p 


77 


30 


1425 


gi11870012 


Drosophila 
melanogaster 


xnp/atr-x DNA helicase 


77 


30 


1426 


gi16185397 


Drosophila 
melanogaster 


LD39815p 


204 


44 


1426 


gi2244793 


Arabidopsis 
thaliana 


disease resistance N like protein 


86 


30 


1426 


AAU84280 


Homo sapiens 


BGHM Human endometrial cancer 
reiaiea protein, ixcivv^i . 


77 


26 


1427 


AAY36302 


Homo sapiens 


HUMA- Human secreted protein 

encoded by gene 


183 


79 


1427 


AAB88359 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0087. 


178 


80 


1427 


AAM41635 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6566. 


178 


80 


1428 


AAU82008 


Homo sapiens 


INCY- Human secreted protein 
SECP34. 


114 




1428 


AAB32391 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 21 SEQ ID 
NO:77. 


114 


HA 


1428 


A/l A V/OJvv 


Homo sapiens 


FTBR- Human collagen IX alpha-3 
chain protein. 


74 


45 


1429 


gi2792523 


Ralstonia 
solanacearum 


alternative RNA sigma factor RpoS 


69 


30 


1429 


gil742822l 


Ralstonia 
solanacearum 


RNA POLYMERASE SIGMA S 
(SIGMA-38) FACTOR 
TRANSCRIPTION REGULATOR 
PROTEIN 


69 


33 






1429 


gi|5032313|r 
ef|NP_0040 
14.1| 


Homo sapiens 


dystrophin Dp140bc isoform; 
Dystrophin (muscular dystrophy, 
Duchenne and Becker types) 


73 


26 


1433 


gi9954445 


Rattus 
norvegicus 


TEMO 


171 


62 


1433 


gil4030260 


maize rayado 
fino virus 


polyprotein 


79 


32 


1433 


AAB95656 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18419. 


77 


36 


1434 


AAR04212 


Homo sapiens 


CALB- Human 32K alveolar surfactant 
protein. 


391 


43 


1434 


AAP60661 


Homo sapiens 


KUSH/ Genomic sequence of human 
alveolar surfactant protein 
(hASP) encoded by genomic DNA. 


386 


43 


1434 


AAB58135 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 473. 


366 


42 


1435 


gil7224904 


Mus musculus 


immunoglobulin superfamily member 9 


180 


48 


1435 


gi20988778 


Homo sapiens 


Similar to immunoglobulin 
superfamily, member 9 


173 


53 


1435 


gil4149050 


Drosophila 
melanogaster 


turtle protein, isoform 4 


114 


36 


1436 


gil465855 


Caenorhabditis 
elegans 


C. elegans PQN-57 protein 
(corresponding sequence R09F10.7) 


85 


23 


1436 


gil465856 


Caenorhabditis 
elegans 


C. elegans PQN-56 protein 
(corresponding sequence R09F10.2) 


85 


23 


1436 


gil7864717 


Mus musculus 


hornerin 


83 


26 


1437 


gi|21292574| 

gb|EAA047 

19.1| 


Anopheles 
gambiae str. 
PEST 


agCP3449 


66 


33 


1438 


ABB10160 


Homo sapiens 


HUMA- Human cDNA SEQ ID NO: 
468. 


166 


62 


1438 


gi9657279 


Vibrio cholerae 


aspartokinase II/homoserine 
dehydrogenase, methionine-sensitive 


71 


28 


1439 


gi4582571 


Gallus gallus 


Hyperion protein, 419 kD isoform 


75 


24 


1439 


gil3165 


Oenothera 
biennis 


ATPase alpha-subunit (aa 1-511) 


72 


26 


1439 


gi903838 


Oenothera 
berteriana 


F-l-ATPase alpha subunit 


72 


26 


1440 


gi4558758 


Homo sapiens 


testis-specific chromodomain Y-like 
protein 


233 


62 


1440 


gi4558762 


Mus musculus 


testis-specific chromodomain Y-like 
protein 


231 


36 


1440 


gi3342716 


Homo sapiens 


testis-specific ChromoDomain Y 
isoform 1 


195 


36 


1441 


gil55627 


Acanthamoeba 
castellanii 


myosin I heavy chain 


118 


42 


1441 


gu3093370 


Mycobacterium 
leprae 


initiation factor IF-2 


110 


55 


1441 


AAY20289 


Homo sapiens 


UYRO- Human apolipoprotein E 
mutant protein fragment 5. 


114 


39 


1442 


gi2253707 


Mus musculus 


Daxx 


84 


36 


1442 


gil934970 


Plasmodium 
falciparum 


AARP1 protein 


79 


65 


1442 


gi4050098 


Mus musculus 


Fas-binding protein 


/O 




1443 


gi2425111 


Dictyostelium 
discoideum 


ZipA 


90 


26 


1443 


AAY06119 


Homo sapiens 


HARD Human CIITA interacting 
protein 104 (CIP104). 


88 


26 


1443 


gi5420387 


Leishmania 
major 


proteophosphoglycan 


86 


21 


1444 


gi893355 


Acinetobacter 
baumannii 


L-2,4-diaminobutyrate decarboxylase 


77 


26 


1445 


ABB55744 


Homo sapiens 


FECH/ Human polypeptide SEQ ID 
NO 94. 


135 


47 


1445 


AAU39035 


Homo sapiens 


GEMY Human secreted protein 
nh328_5. 


135 


47 


1445 


AAY28679 


Homo sapiens 


GEMY Human nh328_5 secreted 
protein. 


135 


47 


1446 


gil9744390 


Homo sapiens 


retinoic acid inducible in 
neuroblastoma cells RAINB1a 


247 


54 


1446 


gil9744388 


Homo sapiens 


retinoic acid inducible in 
neuroblastoma cells RAINB1 


247 


54 


1446 


AAY85565 


Homo sapiens 


JANC Human homologue of UNC-53 
(Hs-UNC-53/2) sequence. 


240 


52 


1447 


AAU19716 


Homo sapiens 


HUMA- Human novel extracellular 
matrix protein, Seq ID No 366. 


HI 

71 


1 1 


1447 


gil8025476 


cercopithecine 
herpesvirus 15 


BPLF1 


HA 
71 


DO 


1447 


AAS14575_ 
aa1 


Homo sapiens 


MILL- Human cDNA encoding G 
protein-coupled receptor, GPCR, 
52872. 


69 


oz 


1448 


gil4027507 


Mesorhizobium 
loti 


salicylate hydroxylase 


69 


31 


1449 


AAG64798 


Homo sapiens 


SREH- Human peptide methionine 
sulphoxide reductase (hPMSR). 


192 


71 


1449 


AAB81893 


Homo sapiens 


SEQU- Human genomic database 
related protein SEQ ID NO: 38. 


192 


71 


1449 


AAM42046 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6977. 


192 


71 


1450 


gi18249657 


Mus musculus 


NC8 


1063 


80 


1450 


gi406748 


Mus musculus 


zinc finger protein 




J 1 


1450 


AAB43498 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO:943. 


249 


37 


1451 


ABB89331 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 1707. 


732 


88 


1451 


gil3421927 


Caulobacter 
crescentus CB15 


MaoC family protein 


273 


4Z ! 


1451 


gil9338616 


Methylobacterium 
extorquens 


R-specific enoyl-CoA hydratase 


261 


44 


1452 


gi|20908171| 
rei[AJr / 
1S.1I 


Mus musculus 


similar to NADPH oxidase 3; NADPH 

oxidase catalytic subunit-like 3 


68 


30 


1452 


gi|17533619| 
ref|NP_4955 
16.1| 


Caenorhabditis 
elegans 


F32A5.8.p 


67 


42 


1453 


gi|15614051| 
ref|NP_2423 
54.1| 


Bacillus 
halodurans 


sodium-dependent phosphate 
transporter 


65 


34 










1454 


gi|17551878| 
ref|NP_4990 
90.1| 


Caenorhabditis 
elegans 


TPR Domain 


76 


29 


1455 


AAM40727 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5658. 


191 


00 


1455 


AAM38941 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2086. 


191 


56 


1455 


gil9702127 


Homo sapiens 


P-Rex1 protein 


191 


56 


1456 


ABB05666 


Homo sapiens 


GEHU- Human nucleic acid 
management protein clone amy2 1 ln4. 


496 


91 


1456 


AAE03372 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein fragment, SEQ ID 
NO:152. 


496 


91 


1456 


AAE03371 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein fragment, SEQ ID 
NO:150. 


496 


91 


1457 


AAM66940 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27246. 


290 


77 


1457 


AAM54534 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26639. 


290 


77 


1457 


AAM64410 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36515. 


287 


77 


1458 


AAB53445 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein sequence SEQ ID NO:985. 


335 


100 


1458 


AAY30055 


Homo sapiens 


ARIA- Amino acid sequence of a 
FK506-binding protein (FKBP). 


165 


91 


1458 


AAQ52277_ 
aa1 


Homo sapiens 


VERT- FK506 binding protein 
(FKBP12A) cDNA. 


159 


100 


1460 


AAU20255 


Homo sapiens 


HUMA- Human novel endocrine 
antigen, SEQ ID No 312. 


104 


76 


1460 


ABB17663 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6320. 


94 


77 


1460 


AAO02331 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 16223. 


88 


61 


1461 


AAM65951 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26257. 


97 


57 


1461 


AAM53568 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25673. 


97 


57 


1461 


AAU83199 


Homo sapiens 


ZYMO Novel secreted protein 
Z891639G1P. 


96 


38 


1463 


gi5565687 


Homo sapiens 


topoisomerase-related function protein 


514 


75 


1463 


gl5 139005/ 


Homo sapiens 


T Alt'" 1 


HUO 


7S 


1463 


gi21430468 


Drosophila 
melanogaster 


LP06848p 


332 


51 


1464 


AAY91421 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 7 SEQ ID 
NO: 142. 


109 


35 


1464 


AAY91396 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 7 SEQ ID 
NO:117. 


109 


35 






1464 


A A "Vftl ISO 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 7 SEQ ID 
NO:73. 


109 


35 


1465 


A AT T1 ^Q7J2 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 931. 


575 


100 


1465 


AAU15958 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 911. 


575 


100 


1465 


gil6041675 


Homo sapiens 


joined to JAZF1 


575 


100 


1466 


AAO01502 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 

NO 15394. 


173 


66 


1466 


gi|10947038| 
rei(Nl J xjoDl 
09.1] 


Homo sapiens 


ankyrin 1, isoform 1; ankyrin-1, 

erythrocytic; ankyrin-R 


74 


28 


1466 


gl|lUy4/U3o| 
rei|JN.r_uojz 

Uo. 1| 


Homo sapiens 


ankyrin 1, isoform 4; ankyrin-1, 
erythrocytic; ankyrin-R 


74 


28 


1467 


gil9354550 


Mus musculus 


similar to src homology three (SH3) 
and cysteine rich domain 


842 


91 


1467 


AAU17352 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 917. 


361 


98 


1467 


gll tyyDOO 


Mus musculus 


stac 


302 


44 


1468 


*icn/:77i 
gUjjUO/ / 1 


Mus musculus 


structural protein FBF1 


767 


74 


1468 


gi7549210 


Babesia 

bigemina 


200 kDa antigen p200 


213 


29 


1468 


gll /*f / 


Oryctolagus 
cuniculus 


trichohyalin 


191 


30 


1469 


gil 1345048 


Homo sapiens 


SCAN domain-containing protein 2 


86 


32 




gii i jzuyw 




SCAND2 


86 


32 


1469 


gil4210722 


Tupaia 

namocinnic 


t41 


86 


30 


1470 


AAY88278 


Homo sapiens 


MILL- Human TANGO 188 protein. 


1442 


100 


1470 


gil4336711 


Homo sapiens 


similar to C. elegans protein F17C8.5 


1442 


100 


1470 


AAA39947_ 
aa1 


Homo sapiens 


MILL- Human TANGO 188 cDNA 


1438 


99 


1471 


AAE10204 


Homo sapiens 


HYSE- Human bone marrow derived 
contig protein, SEQ ID NO: 69. 


71 


44 


1471 


AAA23458_ 
aa1 


Homo sapiens 


ALPH- cDNA encoding human 
secreted protein vp 1 5 1 , SEQ ID 
NO:71. 


67 


46 


1471 


AAB80228 


Homo sapiens 


GETH Human PRO269 protein. 


67 


46 


1472 


A AUCC/ITS 

AAiioo43o 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0210. 


136 


86 


1472 




Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:17188. 


136 


86 


1472 


A A "CA 1 HA < 

AAEU1745 


Homo sapiens 


"HTTN/TA- Human treiiR 1 encoded 

flUiVLn AAUAAAOAA gvilw At wUwUUWU 

secreted protein HOGCS52 variant, 
SEQ ID NO:160. 


136 


86 


1473 


gi9294201 


Arabidopsis 
thaliana 


disease resistance protein 


70 


24 


1474 


AAE19157 


Homo sapiens 


THOR/ Human kinase polypeptide 
(PKIN-15). 


631 


98 


1474 


AAM79131 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1793. 


494 


72 



WO 03/080795 



PCT/US02/25485 



Table 2 



SEQ 
ID 
NO: 


Accession 
No. 






1474 


A A W1 OOOO. 

AAW lyyzu 


Homo sapiens 


jvcAji^ xiuman j\>sr ^Kinase suppressor 
of Ras). 


AQA 


79 


1475 


A ATM O/^AQ 
AAL/lZOUif_ 

aa1 


Homo sapiens 


SAGA Human protein having 
hydrophobic domain encoding cDNA 
clone HP03974. 


657 


73 


1475 


AAO14199 


Homo sapiens 


INCY- Human transporter and ion 
cnannei iKJLUii-lo. 


657 


73 


1475 


AAE06614 


Homo sapiens 


SAGA Human protein having 
hydrophobic domain, HP 03974. 


657 


73 


1476 


gi13905246 


Mus musculus 


RIKEN cDNA 2410024K20 gene 


71 


34 


1476 


gi|17505208| 

•ta(TKTD AO 1 zT 

rer|rsr_Uolo 

9Q 11 


Mus musculus 


CD2 antigen (cytoplasmic tail) binding 
protein/; 1!)UUU11x5UzKik 


71 


34 


1477 


rn'ftfiAAOl 

giouo^ty i 


Rattus 
norvegicus 


guanylyl cyclase 


izin 




1477 


glZO^oUOO 


Canis familiaris 


guanyiaie cyclase xi 


1 1 R 
1 10 




1477 


gi2623074 


Bos taurus 


rod outer segment guanylate cyclase 
precursor 


116 


55 


1478 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


585 


73 


1478 


gi18676710 


Homo sapiens 


FLJ00254 protein 


408 


69 


1478 


AAO04042 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 

NO 17934. 


392 


75 


1479 




Homo sapiens 


(jbriu Human titm (connecting protem 
sequence. 


208 


29 


1479 


gilzlzys/z 


Homo sapiens 


Protein sequence and annotation 
available soon via Swiss-Prot; available 
at present via e-mail from 
Labeit@EMBL-Heidelberg.DE 


208 


29 


1479 


gil7066105 


Homo sapiens 


Titin 


208 


29 


1480 


A A \TA A /COC 

AAV44oo5_ 
aa1 


Homo sapiens 


TEXA Osteoclast inhibitor protein, 
OIP-1, coding sequence. 


94 


41 


1480 


AAB35287 


Homo sapiens 


UROG- Human stem cell antigen-2. 


94 


41 


1480 


A A V00700 

AAYyy /uy 


Homo sapiens 


KJcuU Human stem cell antigen-z, 
hSCA-2. 


94 


41 


1481 


A A 57 AAA 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1672. 


122 


100 


1481 


guzo /Z 


Homo sapiens 


interferon alpha/beta receptor 


122 


100 


1481 


AAQ49625_ 
aa1 


Homo sapiens 


EUBI- Human interferon receptor 
extracellular domain coding sequence. 


118 


96 


1482 


AAD17516_ 
aa1 


Homo sapiens 


SENO- Human taste receptor, hT1R1 
cDNA coding sequence. 


890 


94 


1482 


ABB77319 


Homo sapiens 


INCY- Human G-protein coupled 
receptor oty id inu j. 


890 


94 




A AT51 CYX11 


Homo sapiens 


SENO- Human taste receptor, hT1R1 
protein. 


890 


94 


1483 


gil8376312 


Neurospora 

crassa 


related to SSD1 protein 


109 


39 


1483 


gi2645173 


Schizosaccharomyces 
pombe 


sts5+ 


99 


42 


1483 


gi2459997 


Candida albicans 


protein phosphatase Ssd1 homolog 


99 


40 


1484 


gi|18569064| 
ref|XP_0953 
78.1| 


Homo sapiens 


similar to 40S RIBOSOMAL 
PROTEIN S3A (V-FOS 
TRANSFORMATION EFFECTOR 
PROTEIN) 


319 


96 






1484 


gi|20539276| 
ref|XP_0952 
20.2| 


Homo sapiens 


similar to olfactory receptor MOR145- 
2 


259 


94 


1484 


gi|21295882| 

gb|EAA080 

27.1| 


Anopheles 
gambiae str. 
PEST 


agCP1347 


OO 




1485 


ABB11761 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2131. 


1 07 

17 1 


16 
jO 


1485 


gi930259 


Woolly monkey 
sarcoma virus 


reverse transcriptase (476 AA) 


148 


33 


1485 


gi18076262 


porcine 

endogenous 

retrovirus 


Pol protein 


147 


38 


1486 


AAM74887 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35193. 


1 TO 

I /z 


1 AA 


1486 


AAM62085 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 

NO: 34190. 




inn 


1486 


gil52661 


Plasmid pSB24.2 


neomycin resistance protein 


75 


26 


1487 


gil2653493 


Homo sapiens 


Similar to brain acid-soluble protein 1 


/j 


1A 


1487 


gil7428832 


Ralstonia 
solanacearum 


PROBABLE AVRBS3-LIKE 

PROTEIN 


75 


33 


1487 


gi7329672 


Arabidopsis 
thaliana 


phosphatidate cytidylyltransferase-like 
protein 


70 


A6 


1488 


AAU74754 


Homo sapiens 


INCY- Human protease PRTS-14 
protein sequence. 


2042 


83 


1488 


AAU74752 


Homo sapiens 


INCY- Human protease PRTS-12 
protein sequence. 


476 


39 


1488 


gi11935122 


Mus musculus 


papilin 


431 


/I A 
41) 


1489 


gi|17543712| 
ref|NP_4999 
76.1| 


Caenorhabditis 
elegans 


Y55F3C.8.p 


72 


32 


1489 


gi|20344600| 
ref|XP_1095 
79.1| 


Mus musculus 


RIKEN cDNA 4933431K05 


70 


30 


1489 


gi|11692798| 
gb|AAG400 
02.1|AF320 
125_1 


Xenopus laevis 


ataxia telangiectasia and Rad3-related 
protein 


69 


26 


1490 


AAB95817 


Homo sapiens 


HELI- Human protein sequence SEQ 

ID NO:lool /. 


ZOO 


0-5 


1490 


ABB06369 


Homo sapiens 


rJUiJrv- Jtiuman neurogenesis reiaiea 
protein 12 SEQ ID NO:2. 


171 




1490 


AAB44394 


Homo sapiens 


HUMA- Gene 10 encoded human 
secreted protem fragment as BLA2S 1 a 

LflA&lJr ovVjUvliwwi 


83 


66 


1491 


gi438795 


Mus musculus 


serotonin 1A receptor 


73 


26 


1491 


gil066326 


Mus musculus 


serotonin 1A receptor 


72 


26 


1491 


gi|438795|gb 
|AAA16850. 
1| 


Mus musculus 


serotonin 1A receptor 


73 


26 


1492 


gil6l98083 


Drosophila 
melanogaster 


LD29875p 


87 


33 








1492 


gi2327063 


Pneumocystis 
carinii f. sp. 
carinii 


protease 1 


75 


34 


1492 


gi20420 


Prunus dulcis 


extensin 


75 


34 


1493 


AAG67087 


Homo sapiens 


SHAN- Human ATP-dependent serine 
protein hydrolase 13. 


106 


67 


1493 


AAM76636 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36942. 


103 


68 


1493 


AAM63822 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35927. 


103 


68 


1494 


AAY31225 


Homo sapiens 


AVET Human RNA helicase pl35 
protein. 


73 


38 


1494 


gi3123906 


Homo sapiens 


pre-mRNA splicing factor 


73 


38 


1494 


gil3278975 


Homo sapiens 


pre-mRNA splicing factor similar to S. 
cerevisiae Prp16 


73 


38 


1495 


gi|17568307| 
ref|NP_5098 
37.1| 


Caenorhabditis 
elegans 


collagen 


74 


35 


1496 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


410 


81 


1496 


gi|10834720| 
gb|AAG237 
90.1|AF258 
587_1 


Homo sapiens 


PP565 




77 


1496 


gi|6753924|r 

ef|NP_0343 
74.1| 


Mus musculus 


Friend virus susceptibility 1 


127 


37 


1497 


gi20901968 


Caenorhabditis 
elegans 


C. elegans RPL-36 protein 
(corresponding sequence F37C12.4) 


71 




1497 


gi|17554754| 
ref|NP_4985 
73.1| 


Caenorhabditis 
elegans 


Ribosomal protein YL39 


71 




1498 


gi5305335 


Mycobacterium 
tuberculosis 


proline-rich mucin homolog 


102 


27 


1498 


gi330130 


human 
herpesvirus 1 


latency associated transcript (LAT) 


97 


37 


1498 


AAU83682 


Homo sapiens 


GETH Human PRO protein, Seq ID No 

1 07 
loZ. 


94 


30 


1499 


AAY57937 


Homo sapiens 


INCY- Human transmembrane protein 

11 1 iVLrlN-O 1 . 


199 


81 


1499 


AAYJoziO 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 


1S1 


100 


1499 


AAG75708 


Homo sapiens 


HUMA- Human colon cancer antigen 

«M*s\fnrri CT7H TF> "\JTV A/177 

pro rein ocy jlu jnvj.oh/z. 


141 


92 


1500 


gi21428712 


Drosophila 
melanogaster 


SD05267p 


165 


54 


1500 


gi20975274 


Homo sapiens 


skeletrophin 


114 


40 


1500 


gil9773434 


Mus musculus 


skeletrophin 


99 


52 


1501 


ABB17830 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6487. 


82 


37 


1501 


AAO12929 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26821. 


73 


43 


1502 


gi8778340 


Arabidopsis 
thaliana 


F1504.13 


77 


39 


1503 


AAW03515 


Homo sapiens 


SHKJ Human DOCK180 protein. 


144 


33 


1503 


gil339910 


Homo sapiens 


DOCK180 protein 


144 


33 


1503 


gil3195147 


Mus musculus 


HCH 


129 


25 


1505 


AAM70790 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31096. 


77 


53 


1505 


AAM58316 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30421. 


77 


53 


1505 


gi|21302711| 

gb|EAA148 

56.1| 


Anopheles 
gambiae str. 
PEST 


agCP4916 


77 


30 


1506 


AAU75102 


Homo sapiens 


MYRI- Heat shock protein 8 (Hsp8). 


592 


79 


1506 


AAB82535 


Homo sapiens 


UYCO- Human heat shock protein 
Hsc70. 


592 


79 


1506 


AAE12987 


Homo sapiens 


SRTW Human Hsp70 family 
homologue, Hsc70. 


592 


79 


1507 


ABL53627_ 
aa1 


Homo sapiens 


GENO- Breast protein-eukaryotic 
conserved gene 1 (BSTP-ECG1) 
cDNA. 


213 


92 


1507 


ABB75677 


Homo sapiens 


GENO- Breast protein-eukaryotic 
conserved gene 1 (BSTP-ECG1) 
protein. 


213 


92 


1507 


AAY99421 


Homo sapiens 


GETH Human PRO1433 (UNQ738) 
amino acid sequence SEQ ID NO:292. 


213 


92 


1508 


AAW15565 


Homo sapiens 


UYJO Human intracellular tyrosine 
kinase Tnk1-alpha. 


79 


29 


1508 


gi233062 


Gallus gallus 


src downstream region 


78 


33 


1508 


gi18376366 


Neurospora 
crassa 


related to ribosomal protein S15 
precursor (mitochondrial) 


72 


30 


1509 


gi|21297482| 

gb|EAA096 

27.1| 


Anopheles 
gambiae str. 
PEST 


agCP15541 


68 


36 


1510 


AAM41631 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6562. 


127 


37 


1510 


AAM39845 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2990. 


127 


37 


1510 


AAM79502 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3148. 


127 


37 


1511 


gi21217669 


Mus musculus 


myosin IIIA 


70 


28 


1511 


gi|21302393| 

gb|EAA145 

38.1| 


Anopheles 
gambiae str. 
PEST 


agCP8799 


71 


36 


1511 


gi|20822589| 
ref|XP_1408 
54.1| 


Mus musculus 


similar to myosin IIIA 


70 


28 


1512 


gi6911049 


Babesia bovis 


p9.6.2-like variant erythrocyte surface 
antigen-1a 


82 


28 


1512 


gi6911045 


Babesia bovis 


p9.6.2 variant erythrocyte surface 
antigen-1a 


82 


28 


1512 


gi6911047 


Babesia bovis 


p8.4.1 variant erythrocyte surface 
antigen-1a 


81 


28 


1513 


gil0174843 


Bacillus 
halodurans 


maltose transport system (permease) 


77 


25 


1513 


gi56312 


Rattus 
norvegicus 


Gephyrin 


76 


31 


1513 


gi4325371 


Arabidopsis 
thaliana 


contains similarity to Medicago 
truncatula N7 protein (GB:Y17613) 


74 


28 


1514 


AAY14196 


Homo sapiens 


TAKE/ T cell receptor zeta chain 
protein sequence. 


95 


100 


1514 


gi623042 


Homo sapiens 


T-cell receptor zeta chain 


95 


100 


1514 


gi4960202 


Sus scrofa 


CD3 zeta chain 


95 


100 


1515 


ABB07508 


Homo sapiens 


INCY- Human aminoacyl tRNA 
synthetase (ATRS) polypeptide (ID: 
7474756CD1). 


726 


100 


1515 


AAB43670 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1115. 


604 


82 


1515 


gil464742 


Homo sapiens 


threonyl-tRNA synthetase 


604 


82 


1516 


gi21109348


Xanthomonas 
axonopodis pv. 
citri str. 306 


cytochrome B561 


77 


29 


1516 


gi21114046


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


cytochrome B561 


76 


28 


1516 


gi|21243760| 
ref|NP_6433
42.1| 


Xanthomonas 
axonopodis pv. 
citri str. 306 


cytochrome B561 


77 


29 


1517 


ABB11450


Homo sapiens 


HYSE- Human neurotoxin homologue, 
SEQ ID NO:1820.


119 


33 


1517 


gi8809770 


Mus musculus 


Ly-6L1 


94 


30 


1517 


gi8809768 


Mus musculus 


lymphocyte antigen LY6I precursor 


94 


30 


1519 


gi|59977|em 
b|CAA7866 
2.1|


Human 

endogenous 

retrovirus 


tripartite fusion transcript PLA2L 


171 


67 


1519 


gi|17826947| 
dbj|BAB792
87.1| 


Pseudomonas sp. 
ND137 


beta-1,4-xylanase


73 


34 


1519 


gi|21232680| 
ref|NP_6385
97.1| 


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


ribonuclease PH 


72 


30 


1520 


AAM78023 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 38329. 


190 


100 


1520 


AAM65326 



Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 37431. 


190 


100 


1520 


gil3447468 


Emericella 
nidulans 


FH1/FH2 protein homolog 


121 


49 


1522 


AAG81417 


Homo sapiens 


ZYMO Human AFP protein sequence 
SEQ ID NO:352.


287 


100 


1523 


AAY90349 


Homo sapiens 


SMIK Human fatty acid synthase 
(FAS) protein sequence. 


158 


85 


1523 


AAB43871 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1316.


158 


85


1523 


giy i5i*fz 


Homo sapiens 


fatty acid synthase


158 


85 


1525 


AAG03819 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

"MO* 7Qflft 


93 


100 


1525 


gil311466 


Homo sapiens 


24-kDa subunit of Complex I 


93 


100 


1525 


gil88852 


Homo sapiens 


NADH-ubiquinone reductase


Q1 
yD 


100 

1 \J\i 


1526 


AAD02855_ 
aal 


Homo sapiens 


oUKA Human platelet membrane
glycoprotein VI (GPVI) cDNA.


11 
id 


D 1 


1526 


AAB49403 



Homo sapiens 


jvjldxvq nuTTiftn glycoprotein v i uiaiuic 
protein. 


1% 

ID 


31 


1526 


AAB61257 


Homo sapiens 


MlLJL- Mature Human i ainljw zoo 
protein. 


11 
ID 


D I 


1527 


gil7864896 


Mus musculus 


protocadherin 18 precursor


R1 

O 1 


D 1 


1527 


gil5980222 


Yersinia pestis 


aconitate hydratase 1 


79 


30 


1527 


gil2248353 


Fasciola hepatica 


NADH dehydrogenase subunit 5 


75 


56 


1528 


gi2440214 


Trypanosoma 
brucei brucei 


invariant surface glycoprotein 100 


83 


28 


1528 


gil0567463 


Rhizobium 
rhizogenes


probable virB1 gene


78 


22 


1529 


gi2231279 


Porcine 

reproductive and 
respiratory 
syndrome virus 


envelope protein 


OO 


11 
Dl 


1530 


gi|199851|gb 
|AAA39757. 


Mus musculus 


pol protein 


257 


42 


1530 


gi|1498648|g 

b|AAB0645 

0.1| 


Mus musculus 



Gag-Pol polyprotein 


0^7 


HZ. 


1530 


gi|331995|gb 
|AAB03091.
1|


AKV murine 
leukemia virus 


gag-pol polyprotein (tag amber codon 
at 2250-2252 inserts Gln in Mo-MuLV)


257 


42 


1533 


gi435698 


Homo sapiens 


CD44SP 


136 


100 


1533 


AAV63461_ 
aal 


Homo sapiens 


GEHO Human CD44 antigen cDNA. 


130 


100 


1533 


AAT14724_ 
aal 


Homo sapiens 


GEHO Human haematopoietic CD44 
cDNA clone CD44.5. 


130 


100 


1534 


gi2622165 


Methanothermob 
acter 

thermautotrophic 
us str. Delta H 


acetyltransferase 


71 


29 


1534 


gi|15679078| 

ref|NP_2761
95.1| 


Methanothermob 
acter 

thermautotrophic 
us 


acetyltransferase 


71 


29 


1535 


gi7777 


Drosophila 
melanogaster 


protein H 


73 


28 


1535 


gi457146 


Plasmodium 
yoelii 


rhoptry protein 


73 


38 


1535 


gil3195258 


Plasmodium 
yoelii yoelii 


235 kDa rhoptry protein 


73 


38 


1536 


ABB09740 


Homo sapiens 


BODE- Amino acid sequence of human 
protein phosphatase 11.66. 


132 


43 


1536 


gi|20830386|
ref|XP_1456
42.1|


Mus musculus


similar to importin alpha lb


72


35










1537 


gil4039907 


Rattus 
norvegicus 


cytochrome P450 monooxygenase 


353 


39 


1537 


gi2920650 


Mus musculus 


cytochrome P450 CYP2B19 


275 


44 


1537 


gi2353336 


Capra hircus 


cytochrome P450 


971 

Z / i 




1538 


AAU83175 


Homo sapiens 


ZYMO Novel secreted protein 


282 


100 


1538 


gi6714803


Streptomyces
coelicolor A3(2)


integral membrane protein. 


77 


26 


1539 


gil2963397 


Prunus x 
yedoensis 


ribulose-1,5-bisphosphate
carboxylase/oxygenase large subunit 


74 


32 


1539 


gi466436 


Saccharomyces 
cerevisiae 


BOI1 


oy 




1539 


gi5833897 


Besleria affinis 


ribulose 1,5-bisphosphate carboxylase 
large subunit 


oy 


J 1 


1542 


AAY32193 


Homo sapiens 


INCY- Human receptor molecule 
(REC) encoded by Incyte clone 
044150. 


73 


26 


1542 


gi7576677 


Helicobacter 
pylori 


IceA1


72 


44 


1542 


gi|20841498| 
ref|XP_1315 
41.1| 


Mus musculus 


similar to MUF1 protein 


73 


26 


1546 


gi14581448


Homo sapiens 


FSHD Region Gene 2 protein


ID 


*fZ 


1546 


gil5982852 


Arabidopsis 
thaliana 


AT5g66850/MUD21Jl 


71 


34 


1546 


gi|14581448| 

gb|AAK219 

77.1| 


Homo sapiens 


FSHD Region Gene 2 protein 


73 


42 


1547 


gil 8676660 



Homo sapiens 


r u OOzzy protem 


1 QO 

iyz 


2*Z 


1547 


AAU21409 


Homo sapiens 


HUMA- Human novel foetal antigen, 
SEQ ID NO 1653.


1 70 

i /y 


inn 


1547 


AAM42128 



Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 7059. 


1 1 A 
1 l*t 


^i 


1548 


AAG64494 


Homo sapiens 


SHAN- Human natriuretic peptide 
receptor lo. 


539 


100 


1548 


gil8676710 


Homo sapiens 


FLJ00254 protein


268 


77 


1548 


AAB28764 


Homo sapiens 


HUMA- Sequence homologous to 
protein fragment encoded by gene 21. 






1549 


AAB67055 


Homo sapiens 


INCY- Human immune response 
molecule ^uviujn ) protein onv</ id inv-f. 

A 

y. 


606 


82 


1549 


AAO01862 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 

NO 15754. 


404 


72 


1549 


gi|6753924|r 
ef|NP_0343
74.1| 


Mus musculus 


Friend virus susceptibility 1 


213 


36 


1550


oi1Qfl19Q 




70kDa peroxisomal membrane protein 


92 


100 


1550 


gi825711 


Homo sapiens 


70kD peroxisomal integral membrane 
protein 


92 


100 


1550 


gi220862 


Rattus 
norvegicus 


PMP70 


89 


94 


1551 


AAM69543 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ
ID NO: 29849.


228


100






1551 


AAM57148 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 29253. 


228


100


1551 


AAB93944 


Homo sapiens


HELI- Human protein sequence SEQ 
ID NO:13960.


94 


^7 
J/ 


1552 


gi4884924 


Rangiferine 
herpesvirus 1 


glycoprotein C 


75 


34 


1552 


gi|18556240| 
ref|XP_0676
28.2| 


Homo sapiens 


similar to Salivary glue protein SGS-3 
precursor 


78 


30 


1552 


gi|4884924|g

b|AAD3187 

6.1| 


Rangiferine 
herpesvirus 1 


glycoprotein C


75 


34 


1553 


gi|2193870|d 
bj|BAA2041 
9.1|


Mus musculus 


reverse transcriptase 


176 


35 


1553 


gi|2731767|g 

b|AAC5354 

2.1|


Mus musculus 


endonuclease/reverse transcriptase 


1 ia 
1 /o 




1554 


ABB08776 


Homo sapiens 


BODE- Human neuregulin 55 SEQ ID
NO 2. 


HZ 
/J 


OQ 


1554 


AAM92816 


Homo sapiens 


HUMA- Human digestive system 
antigen SEQ ID NO: 2165. 


*71 
/l 


00 

Zy 


1554 


gi|6322838|r 
ef|NP_0129
11.1|


Saccharomyces 
cerevisiae 


Protein required for cell viability; 
Ykl014cp 


70 


07 

LI 


1555 


gi7528184 


Drosophila 
melanogaster 


bicoid-interacting protein BIN3


TO 

78 


OQ 


1555 


gil5292595 


Drosophila 
melanogaster 


SD09926p 


78 


oc 


1555 


gi4514620


Mus musculus 


Ror2 


71 


O/l 


1557 


ABA91504_ 
aal 


Homo sapiens 


EYEE- Human epidermal growth factor 
receptor precursor cDNA. 


144 


93 


1557 


AAF85332_ 
aal 


Homo sapiens 


NOVS Nucleotide sequence of wild 
type EGFR1. 


144 


93 


1557 


AAM50768 


Homo sapiens 


EYEE- Human epidermal growth factor 
receptor precursor. 


144 


93 


1558 


AAB99950 


Homo sapiens 


SHAN- Human alkylated-DNA-protein
cysteine methyltransferase 14. 




inn 


1558 


AAU16267 


Homo sapiens 


HUMA- Human novel secreted protein, 
SeqID 1220. 


001 




1558 


ABB11507 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO: 1877. 




07 

y l 


1559 


fiil4599730 


Spachea correae 


maturase 


71 


28 


1559 


gil4599648 


Blepharandra 
heteropetala 


maturase 


71 


30 


1559




VJoXp OllUld 

gracilis 


maturase


70 


28 


1560 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


340 


83 


1560 


gi|13310191|
gb|AAK181
89.1|AF331
500_1


multiple
sclerosis
associated
retrovirus
element


recombinant envelope protein


260


70







1560 


gi|21103962|
gb|AAM331 
41.1| 


Homo sapiens 


enverin-2 


248 


84


1561 


AAB94698 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15680. 


107 


95 


1561 


AAU18480 


Homo sapiens 


HUMA- Human endocrine polypeptide 
SEQ ID No 435. 


107 


95 


1561 


ABB10288 


Homo sapiens 


HUMA- Human cDNA SEQ ID NO: 
596. 


107 


95 


1562 


gi969078 


Drosophila
melanogaster 


S-adenosylhomocysteine hydrolase 


73 


26 


1562 


gi21064553 


Drosophila
melanogaster 


RE58316p 


73 


26 


1562 


AAM41205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 6136. 


72 


30 


1563 


gil778844 


Dictyostelium 
discoideum 


LimA 


71 


34 


1563 


gi|20985456| 
ref|XP_1421
11.1|


Mus musculus 


similar to actin beta chain - human 


75 


36 


1563 


gi|1778844|g 

b|AAB4092 

9.1| 


Dictyostelium 
discoideum 


LimA 


71 


34 


1564 


gi|9507757|r 
ef|NP_0614
23.1| 


Plasmid F


resolvase 


507 


91 


1564 


gi|148589|gb 
|AAA24900.
1|


Plasmid F


Protein D 


507 


91 


1564 


gi|10955295| 
ref|NP_0526
36.1| 


Escherichia coli 


resolvase 


501 


90 


1565 


gi7649370 


Arabidopsis 
thaliana 


guanine nucleotide-exchange-like 
protein 


77 


38 


1565 


gil674160 


Mycoplasma 
pneumoniae 


involved in cytadherence, see: 
MPN142 


71 


35 


1565 


gi|15229258|
ref|NP_1899
16.1| 


Arabidopsis 
thaliana 


guanine nucleotide-exchange-like
protein 


77 


38 


1566 


gil799600 


SwissProt 
Accession 
Number P31458


similar to 


1051 


99 


1566 


gil38 14506 


Sulfolobus 

solfataricus


Mandelate racemase /muconate 
lactonizing enzyme related protein
(MR/MLE) 


286 


35 


1566 


gil0640034 


Thermoplasma 
acidophilum 


starvation-sensing protein rspA related 
protein 


270 


35 


1567 


gil3359972 


Escherichia coli 
O157:H7


acridine efflux pump 


573 


98 


1567 


gi1773144


Escherichia coli 


probable transmembrane protein AcrE 


573 


98


1567 


gi532311 


Escherichia coli 


1 1 4 kDa protein 




OS 


1569 


gi8918871 


YccA of plasmid
ColIb-P9] 
[Plasmid F 


96 pct identical to gp:AB021078_30


288 


98 


1569 


gi|17136976| 
ref|NP_4770
26.1| 


Drosophila 
melanogaster 


repo-Pl; Antibody RK2 


71 


33 


1569 


gi|6502544|g 
b|AAF14351 
.1|AF11019 
8_1


Glomus 
intraradices 


homeobox protein HB1


70 


31 


1570 


gil3363792 


Escherichia coli
O157:H7


zinc-transporting ATPase 


410 


87 


1570 


gi466605 


Escherichia coli 


No definition line found 


410 


5 / 


1570 


gil2518128 


Escherichia coli 

O157:H7

EDL933 


zinc-transporting ATPase 


410 


87 


1571 


AAU83186 


Homo sapiens 


ZYMO Novel secreted protein 
Z887014G7P. 


1006 


100 


1571 


gi7248459 


Zea mays 


arabinogalactan protein 


85 


29


1571 


gi3513742


Arabidopsis 
thaliana 


contains similarity to Zea mays 
embryogenesis transmembrane protein 
(GB:X97570) 


82 


35 


1572 


gil2597465 


Caenorhabditis 
elegans 


CED-1 


72 


44 


1572 


gil9571666 


Caenorhabditis 
elegans 


similar to EGF-like domain 


72 


44


1572 


gi4883938 


Drosophila 
melanogaster 


laminin alpha1,2


67 


31 


1573 


ABB12490 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


106 


38 


1574 


»1478205 


Mus musculus 


PNG protein 


75 


41 


1574 


AAM40148 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3293. 


69 


56 


1574 


AAM79341 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
2987. 


69 


35 


1576 


gi|20882651| 
ref|XP_1233
03.1|


Mus musculus 


ATPase, class 2, member b 


234 


91 


1576 


gi|7656918|r 
ef|NP_0566
20.1| 


Mus musculus 


ATPase, class 2, member b; ATPase 
9B, class II; ATPase 9B, p type 


234 


91 


1577 


gil8143418 


Alteromonas sp. 
O-7


chitinase A 


77 


39 


1577 


gil5426105 


Leishmania 
major 


probable surface antigen protein 


75 


24 


1578 


gil9702241 


Homo sapiens 


rabconnectin 


439 


93 


1578


gl /4 DZynO 


Homo sapiens


A-1L&.C 1 piULClIl 


132 


41 


1578 


gil279384 


Drosophila 
melanogaster 


X 


109 


29 


1580 


AAE20337 


Homo sapiens 


HUMA- Human B7-H1 1 protein 
mature extracellular domain. 


122 


23 


1580 


AAE20336 


Homo sapiens 


HUMA- Human B7-H1 1 protein 
extracellular domain. 


122 


23


1580


gi2062702 


Homo sapiens 


butyrophilin 


122 


23 


1581 


AAE18640 


Homo sapiens 


INCY- Human G-protein coupled 
receptor (GCREC-1). 


70 


35 


1581 


gil8369751 


Oryza sativa 


ethylene responsive protein 


70 


50 


1581 


gil5217292 


Oryza sativa] 
[Oryza sativa 
(japonica 
cultivar-group) 


Putative AP2 domain containing 
protein 


70 


50 


1583 


gi6468047 


Homo sapiens 


Kruppel-like factor 


85 


73 


1583 


gi5916096 


Homo sapiens 


Kruppel-like factor LKLF 


85 


73 


1583 


gi4583418 


Homo sapiens 


Kruppel-like zinc finger transcription 
factor 


85 


73 


1585 


gi2570021 


Homo sapiens 


paired box containing transcription 
factor 


77 


37


1585 


gi3115988 


Homo sapiens 


dJ394P2'l.l (PAX-7) 


77 


37 


1585 


gi2570015 


Homo sapiens 


alternative 


77 


37 


1586 


gi7861533 


Rattus 
norvegicus 


retina specific protein PAL 


72 


43 


1586 


gi20977028 


Xenopus laevis 


mitotic phosphoprotein 39 


72 


34 


1586 


AAB58458 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 796.


68 


39 


1587 


gi5901864 


Drosophila 
melanogaster 


BcDNA.LD27873 


81 


24 


1587 


gil5458514 


Streptococcus 
pneumoniae R6 


Pneumococcal histidine triad protein D 
precursor 


78 


27 


1587 


gi5042400 


Homo sapiens 


NFI-X3=transcription factor [AA 


75 


30 


1592 


gi4210501 


Homo sapiens 


BC85722 1 


253 


61 


1592 


gil4794910 


Homo sapiens 


capicua protein 


253 


61 


1592 


gil4794914 


Mus musculus 


capicua protein 


253 


61 


1593 


gi|8131854|g 
b|AAF73108 
.1|AF14795 
6_1


Trypanosoma 
cruzi 


antigen JL8 


69 


34 


1595 


gil8892729 


Pyrococcus 
furiosus DSM
3638 


3-hydroxyisobutyrate dehydrogenase 


70 


27 


1595 


gi|20847046| 
ref|XP_1366
21.1| 


Mus musculus 


similar to Transcription factor BTF3 
(RNA polymerase B transcription 
factor 3) 


70 


28 


1595 


gi|18977088| 
ref|NP_5784
45.1| 


Pyrococcus 
furiosus DSM
3638 


3-hydroxyisobutyrate dehydrogenase 


70 


27 


1597 


AAU83621 


Homo sapiens 


GETH Human PRO protein, Seq ID No 
60. 


151 


42


1597 


AAO05826 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19718. 


146 


83 


1597 


AAM41346 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6277. 


102 


46 


1598 


AAM79503 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3149. 


80 


35 


1598 


AAM78519 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1181. 


80 


35 


1598 


gil 8676526 


Homo sapiens 


FLJ00160 protein 


80 


35 


1599 


gi2149640


Arabidopsis
thaliana


Argonaute protein


72


33








1599


gil 5027491 


respiratory 
syncytial virus 


glycoprotein 


71 




1599


gi|15221177|
ref|NP_1752
74.1|


Arabidopsis 
thaliana 


leaf development protein Argonaute 


79 




1601 


gi17130010


Nostoc sp. PCC 
7120 


WD-40 repeat protein 


136 


28 


1601 


gi1653631


Synechocystis 
sp. PCC 6803 


beta transducin-like protein 


131 


26 


1601 


gi17135261


Nostoc sp. PCC 
7120 


WD-40 repeat protein 


115 


27 


1602 


gil 103853 


Rattus 
norvegicus 


rHAPl-A 


89 


33 


1602 


gil 103851 


Rattus 
norvegicus 


huntingtin associated protein 


89 


33 


1602 


gil4579673 


Takifugu
rubripes 


pericentriolar material 1 protein 


87 


30 


1603 


gi537446 


Arabidopsis 
thaliana 


AtHbPlOl 


fj 


71 


1603 


gil2324908 


Arabidopsis 
thaliana 


neat shock protein iui, iouyj-ioz*fu 


7^ 
/D 


71 


1603 


gi6715468 


Arabidopsis 
thaliana


heat shock protein 101 


75 


31 


1604 


gi2190531


Vibrio cholerae 


methyl accepting chemotaxis protein 


71 


26 


1604 


gi9657614 


Vibrio cholerae 


hemolysin secretion protein HylB 


71 


Of. 


1604 


gi9655306 


Vibrio cholerae 


heat shock protein GrpE 


70 


35


1605 


gi3912936


Geobacillus 

stearothermophil 

us 


ornithine carbamoyl transferase 


05 


1 1 


1606 


gi8797 


Drosophila 
melanogaster 


CYS3HIS finger protein 


678 


51 


1606 


gi15291975


Drosophila 
melanogaster 


LD33756p 


Ol / 


CO 


1606 


gi6967181


Homo sapiens 


c399E4.1 (similar to D.melanogaster 
unkempt protein.) 


549 


75 


1607 


gi|21301783|
gb|EAA139
28.1|


Anopheles 
gambiae str. 
PEST 


agCP8730 


11 




1607 


gi|21361276| 
ref|NP_0060
75.2| 


Homo sapiens 


interferon-stimulated transcription 
factor 3, gamma (48kD); interferon- 
stimulated gene factor 3, gamma 
subunit (48 kD) 


68 


29 


1609 


gi2661094 


Spinacia 
oleracea 


cold acclimation protein 


76 


32 


1612 


gi|1780975|e 
mb|CAA714 

1R 11 
io.l| 


Human 
endogenous 
retrovirus K


gag protein 


312 


34 


1612 


gi|5802810|g 

b|AAD5179 

1.1|


Homo sapiens 


Gag-Pro-Pol protein 


309 


34 


1612 


gi|887448|e 

mb|CAA513 

06.1| 


Human 

endogenous 

retrovirus 


gag 


309 


34


1613


AAO13889


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 27781. 


73 


42 


1614 


gil 1065727 


Homo sapiens 


dJ493F7.1 (similar to murine BET3)


347 


100


1614 


gi2791806 


Mus musculus 


bet3 


253 


69 


1614 


gil3277654 


Mus musculus 


Bet3 homolog (S. cerevisiae) 


253 


69 


1615 


gil 122901 


Saccharomyces 
cerevisiae 


MSP8 


77 


20 


1615 


gi825546 


Saccharomyces 
cerevisiae 


Cat8p 


77 


20 


1615 


gil7978563 


Xenopus laevis 


Sp1-like zinc-finger protein XSPR-1


75 


40 


1616 


AAY02536 


Homo sapiens 


ICOS- Human ICAM-6 protein 
sequence. 


458 


98 


1616


^il2248907 


Homo sapiens 


TCAM-1 


458 


98 


1616 


gi4579740 


Rattus 
norvegicus 


testicular cell adhesion molecule 1 
(TCAM1) 


366 


76 


1617 


AAM67067 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27373. 


271 


64 


1617 


AAM54664 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26769. 


271 


64 


1617 


AAM56747 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28852. 


229 


69 


1618 


gi5802814 


Homo sapiens 


Gag-Pro-Pol-Env protein 


532 


52 


1618 


gil780973 


Human 
endogenous 
retrovirus K 


pol protein 


531 


52 


1618 


gi5802821 


Homo sapiens 


Gag-Pro-Pol protein 


531 


52 


1619 


gi2769587 


Mus musculus 


STOP protein 


662 


86 


1619 


gil370291 


Rattus 
norvegicus 


STOP protein 


662 


92 


1619 


gi3287265 


Rattus 
norvegicus 


E-STOP protein 


662 


92 


1620 


AAM65980 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26286. 


266 


100 


1620 


AAM53601 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25706. 


266 


100 


1620 


gi|20270271|
ref|NP_6200
S2.ll 


Mus musculus 


RIKEN cDNA 1190017012


198


80


1621 


gi11862941


Mus musculus 


DDM36E 


74 


33 


1621 


gi11862939


Mus musculus 


DDM36 


74 


33 


1621 


gi7650186 


Mus musculus 


neighbor of Punc e11 protein


73 


33 


1622 


gi3157464


Thermus sp. A4 


integral membrane protein 


74 


38 


1623 


gi|59977|em 
b|CAA7866 
2.1| 


Human 

endogenous 

retrovirus 


tripartite fusion transcript PLA2L 


129 


82 


1623 


gi|20161147|
dbj|BAB900
75.1| 


Oryza sativa 

(japonica 

cultivar-group) 


VsaA -like protein 


88 


32 


1623 


gi|17864474|
ref|NP_5248
33.1|


Drosophila
melanogaster


domino


87


41








1626 


AAO00498 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 14390. 


99 


43 


1627 


gil4041733 


Xenorhabdus 
nematophila 


XptA2 protein 


70 


23 


1627 


gi|15641593| 
ref|NP_2312
25.1| 


Vibrio cholerae 


catalase 


69 


23 


1628 


gil9888204 


Methanopyrus 
kandleri AV19 


Site-specific DNA methylase 


80 


27 


1628 


gi6358691 


Simian 

immunodeficienc 
y virus 


Pol protein 


78 


32 


1628 


gi|20094956| 
ref|NP_6148
03.1| 


Methanopyrus 
kandleri AV19 


Site-specific DNA methylase 


80 


27 


1629 


AAB07704 


Homo sapiens 


INMR Protein encoded by the 
endogenetic fragment of HERV-W. 


594 


67 


1629 


gi8272464 


Homo sapiens 


gag 


594 


67 


1629 


AAB07703 


Homo sapiens 


INMR Protein encoded by the 
endogenetic fragment of HERV-W. 


590 


66 


1630 


gi32498 


Homo sapiens 


precursor (AA -23 to 476) 


145 


100 


1630 


gi339595 


Homo sapiens 


triglyceride lipase precursor 


145 


100 


1630 


gi386859 


Homo sapiens 


hepatic lipase 


145 


100 


1631 


gi8777465 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


703 


77 


1631 


gil7019507 


Tripneustes 
gratilla 


dynein heavy chain isotype IB 


505 


53 


1631 


AAB93815 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:13606.


457 


71 


1632 


AAM68837 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29143. 


122 


48 


1632 


AAM56460 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28565. 


122 


48 


1632 


gil7861826 


Drosophila 
melanogaster 


GM01964p 


90 


51 


1633 


gi|21300783| 

gb|EAA129 

28.1| 


Anopheles 
gambiae str. 
PEST 


ebiP1105 


77 


33 


1633 


gi|19880523| 
gb|AAM003 
72.1|AF368 
053_1


Bactrocera 
dorsalis 


vitellogenin 1 precursor 


68 


27 


1633 


gi|21070999| 
ref|NP_0659
11.1|


Homo sapiens 


stromal interaction molecule 2 
precursor 


68 


39 


1637 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


289 


91 


1637 


gi|21103962|
gb|AAM331
41.1|


Homo sapiens


enverin-2


261


82










1637 


gi|13310191| 
gb|AAK181 
89.1|AF331 
500_1 


multiple 

sclerosis 

associated 

retrovirus 

element 


recombinant envelope protein 


259 


82 


1638 


AAR58809 


Homo sapiens 


UYNY Human RPTP-gamma. 


86 


26 


1638 


gi292411 


Homo sapiens 


receptor-type protein tyrosine 
phosphatase gamma 


86 


26 


1638 


gil263069 


Homo sapiens 


receptor tyrosine phosphatase gamma 


86 


26 


1639 


gi9857054 


Leishmania 
major 


possible CG7055 protein 


74 


27 


1639 


gi|20853034| 
ref|XP_1259
62.1| 


Mus musculus 


expressed sequence AI447519 


73 


35 


1639 


gi|7008003|d 
bj|BAA9087 
4.1| 


Mus musculus 


transcription factor MAZR 


73 


35 


1640 


AAG03810 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7891. 


220 


95 


1640 


gil86800 


Homo sapiens 


ribosomal protein L12 


220 


95 


1640 


gi57680 


Rattus rattus 


ribosomal protein L12 


220 


95 


1641 


AAB44286 


Homo sapiens 


GETH Human PRO1072 (UNQ529) 
protein sequence SEQ ID NO:303.


1709 


100 


1641 


AAY41730 


Homo sapiens 


GETH Human PRO1072 protein 
sequence. 


1709 


100 


1641 


gil4602625 


Homo sapiens 


PAN2 protein 


1709 


100 


1642 


gi20147241 


Arabidopsis 
thaliana 


AT5g09850/MYH9_6 


74 


32 


1642 


gil4329782 


Homo sapiens 


dJ1121G12.3 (Novel gene) 


72 


28 


1642 


gi|16648730|

gb|AAL255 

57.1| 


Arabidopsis 
thaliana 


AT5g09850/MYH9_6 


74 


32 


1643 


gi2952340 


Rattus 
norvegicus 


insulin receptor substrate 2 


89 


31 


1643 


gi2653351 


Bovine 

herpesvirus type 
1.1 


product of latency-related gene 


83 


30 


1643 


gi4511969 


Homo sapiens 


insulin receptor substrate-2 


82 


26 


1644 


gi9964099 


Chlamydia 
trachomatis 


inclusion membrane protein 


73 


35 


1644 


gil9171028 


Encephalitozoon 
cuniculi 


ATP DEPENDENT DNA BINDING 
HELICASE (RAD3/XPD 
SUBFAMILY OF HELICASES) 


67 


29 


1644 


gi|9964095|g 
b|AAG0982 
1.1|AF2793 
62_1


Chlamydia 
trachomatis 


inclusion membrane protein 


73 


35 


1646 


gi|10863995| 
ref|NP_0670
11.1| 


Homo sapiens 


clones 23667 and 23775 zinc finger 
protein 


67 


42 


1647 


gil 196425 


Homo sapiens 


envelope protein 


93 


39 


1647 


gi200296 


Mus musculus 


perlecan 


85 


26


1647 


gi8131894 


Homo sapiens 


mitofilin 


84 


27 


1648 


gil 573040 


Haemophilus
influenzae Rd 


aspartokinase I / homoserine 
dehydrogenase I (thrA) 


11 




1648 


gi8778726 


Arabidopsis 
thaliana 


T25N20.14 


11 
ID 


11 


1648 


gi|16272063| 
ref|NP_4382
62.1| 


Haemophilus 
influenzae Rd 


aspartokinase I / homoserine 
dehydrogenase I (thrA) 


11 

/i 


1& 
30 


1649 


gi295642 


Saccharomyces 
cerevisiae 


phospholipase C 


79 


36 


1649 


gi7548846 


Saccharomyces 
cerevisiae 


delta class phosphoinositide-specific 
phospholipase C homolog 


77 


36 


1649 


gil61104 


Schistosoma 
mansoni 


engrailed-like homeodomain protein 


74 


35 


1651 


gi|13129464|
gb|AAK131 
22.1|AC080 
019_14


Oryza sativa] 
[Oryza sativa 
(japonica
cultivar-group) 


Polyprotein 


00 


AC\ 
4U 


1652 


AAG81446 


Homo sapiens 


ZYMO Human AFP protein sequence 
SEQ ID NO:41U.


249 


100 


1652 


gil8032212 


Homo sapiens 


histone acetyltransferase MOZ2 


89 


34 


1652 


AAR34936 


Homo sapiens 


UYJO CENP-B. 


77 


35 


1653 


gi20145484 


Bos taurus 


SCO-spondin 


*71 
/l 


70 


1655 


AAM86382 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ 
ID NO: 13975. 


129 


55 


1655 


ABB03887 


Homo sapiens 


HUMA- Human musculoskeletal 
system related polypeptide SEQ ID NO 
1834. 


118


OZ 


1655


AAM75964 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 36270. 


85 


56 


1659 


gi38035 


Homo sapiens 


p25 protein 


1 1 A 

1 1U 




1659 


gi330915 


Equine 
herpesvirus 1 


IR4 protein 


99 


28 


1659 


gil56606 


Chironomus 
tentans 


Spld 


QA 

o4 


1C\ 
D\J 


1660 


gi9654641 


Vibrio cholerae 


3-deoxy-D-manno-octulosonic-acid 
transferase 


84 


23 


1660 


gi|20835446| 

ref|XP 1444 
09.1|


Mus musculus 


similar to STARP antigen 


n 


75 


1660 


gi|15596880|
ref|NP_2503
74.1| 


Pseudomonas 
aeruginosa 


probable sugar aldolase 


10 


7fi 
ZO 


1661 


gi4062318


Escherichia coli 


Heat-responsive regulatory protein 


70 


16 


1661 


gi976025 


Escherichia coli 


HrsA 


79 


36 


1661


gii /ooyji 


Escherichia coli
K12


TvmtfMn mnHifi rati fin ^TiTvme induction 

of ompC 


79 


36 


1662 


AAM68588 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 28894. 


155 


100 


1662 


AAM56212 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 28317.


155


100






1662 


gi3845169


Plasmodium 
falciparum 3D7


phosphatase (acid phosphatase family) 


OO 




1663 


AAG89215 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 555.


218 


100 


1663 


gi20070921 


Mus musculus 


RIKEN cDNA 2410008M22 gene 


130 


55 


1663 


AAR77602 


Homo sapiens 


FORS/ Human circulating cytokine 
CC-1 C-terminal fragment. 


OO 


A A 


1664 


AAE18212 


Homo sapiens 


CURA- Human MOL4 protein. 


75 


47


1664 


AAM00966 


Homo sapiens 


HYSE- Human bone marrow protein, 
SEQ ID NO: 442. 


72 


35 


1665 


AAB92828 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:11365.


74 


93 


1665 


AAG63852 


Homo sapiens 


INCY- Amino acid sequence of human 
GTPase activating protein GTPAP2. 


74 


93 


1665 


AAG63851 


Homo sapiens 


INCY- Amino acid sequence of human 
GTPase activating protein GTPAP1. 


74 


93 


1666 


AAM72897 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 33203. 


135 


65 


1666 


AAM60268 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NU: 52.5 15. 


135 


65 


1666 


gi4007097 


Homo sapiens 


dJl 1 18D24.2 (60S Ribosomal Protein 

T 1 fi T TVT?\ 
JL1U JLJJsJbj 


135 


65 


1667 


gi212267


Gallus gallus 


cartilage link protein 


917 


49 


1667 


gi2010 


Sus scrofa


link protein precursor (AA -15 to 339) 


Oil 

yi5 


CI 

31 


1667 


gi459439 


Equus caballus 


link protein 


910


51 


1668 


gil0443237 


Mus musculus 


splicing factor 3a, subunit 2


276


36 


1668 


gi396743 


Podocoryne 
carnea 


Pod-EPPT 


276 


30 


1668 


gi294131 


Plasmodium 
falciparum 


circumsporozoite protein 


266 


22 


1669 


AAM49641 


Homo sapiens 


BOEH Human tumour-associated 
antigen B345 protein SEQ ID NO 4.


132 


65 


1669 


AAU12252 


Homo sapiens 


GETH Human PRO5773 polypeptide
sequence. 


132 


65 


1669 


AAY91592 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 6 SEQ ID 
NO:265. 


132 


65 


1670 


gi4835383 


Homo sapiens 


alias DLC1


zzo 




1670 


gi4704343 


Homo sapiens 


alias DLC1; candidate tumor
suppressor gene 


226 


47 


1670 


gi155627


Acanthamoeba 
castellanii 


myosin I heavy chain 


118 


42 


1671 


ABB12490


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


237 


88


1671


glOUlMSOZ


Streptomyces
fradiae


glycosyl transferase






1671 


gi|9634613|r
ef|NP_0381
50.1|


Human 
papillomavirus 
type 69 


L1


65 


39


1672 


gi13938013


Homo sapiens 


Similar to RIKEN cDNA 2610509G12 
gene 


333 


66 
1672


gi2388970 


Schizosaccharom 
yces pombe 


tat-binding homolog 7, AAA ATPase 
family protein 


235 


41 


1672 


gi6850321 


Arabidopsis 
thaliana 


Contains similarity to YTA7 ATPase 
gene from Saccharomyces cerevisiae 
gb|X81072, and contains Bromodomain
PF|00439, AAA PF|00004, and Sigma-
54 PF|00158 transcription factor
domains. 


214 


40 


1673 


gi11066113


Drosophila 
melanogaster 


Misexpression suppressor of ras 4 


71 


29 


1673 


gi|20829387| 
ref|XP_1295
40.1| 


Mus musculus 


RIKEN cDNA 4930455F23 


77 


27 


1673 


gi|17647635| 
ref|NP_5237
75.1| 


Drosophila 
melanogaster 


Misexpression suppressor of ras 4 


71 


29 


1674 


gi|20535935| 
ref|XP_1157
87.1| 


Homo sapiens 


similar to splicing coactivator subunit 
SRm300; RNA binding protein; AT- 
rich element binding factor 


75 


37 


1674 


gi|17544226|
ref|NP_5001
51.1|


Caenorhabditis 
elegans 


Y76B12C.4.p 


72 


34 


1674 


gi|17559826| 
ref|NP_5057 
99.1| 


Caenorhabditis 
elegans 


sepB domain 


li\ 


ZD 


1675 


gi5708067 


Oryctolagus 
cuniculus 


hyperpolarization activated cation 
channel 


99 


27 


1675 


gi402558 


Canis familiaris


mucin 


98 


27 


1675 


gi10636484


Homo sapiens 


polyglutamine-containing protein 


96 


26 


1676 


AAM95365 


Homo sapiens 


HUMA- Human reproductive system
related antigen SEQ ID NO: 4023.


73 


26 


1676 


AAB56709 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1287. 


72 


34 


1676 


gi1881288


Bacillus subtilis 


FUNCTION UNKNOWN, SIMILAR
PRODUCT IN E.COLI, H.
INFLUENZAE AND NEISSERIA
MENINGITIDIS.


71 


OA 

jU 


1677 


gi|15892512|
ref|NP_3602
26.1|


Rickettsia
conorii


phosphatidate cytidylyltransferase
[EC:2.7.7.41]


(LQ 
OD 


'lA 


1679 


gi14231


Saccharomyces 
cerevisiae 


NADH dehydrogenase (ubiquinone) 


/D 


J 1 


1679 


gl8U5U22 


Saccharomyces 
cerevisiae 


JNullp 


71 
ID 


11 
J 1 


1679 


gi1353352


Chlamydomonas 
reinhardtii 


alanine aminotransferase 


i(\ 


01 


1680 


gi1805421


Bacillus subtilis 


surfactin production 


n 
1 1 




1680




Bacillus subtilis 


srfA2 


77 


36 


1680 


gi516360


Bacillus subtilis 


surfactin synthetase 


77 


36 


1681 


AAG64494 


Homo sapiens 


SHAN- Human natriuretic peptide 
receptor 18. 


156 


80 


1681 


AAE16275 


Homo sapiens 


INCY- Human kinase PKIN-21 
protein. 


154 


73 


1681 


AAM40599 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 5530.


154


73


1682 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


1646


/D 


1682 


gi|2351212|d 
bj|BAA2206 
4.1| 


Friend murine 
leukemia virus 


gag-pol polyprotein (precursor protein) 


807 


40 


1682 


gi|9626961|r 
ef|NP_0579
33.1| 


Murine leukemia 
virus 


Pr180


802 


40 


1683 


AAM39205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2350. 


457 


53 


1683 


gi3033415 


Gibbon ape 
leukemia virus 


gag polyprotein 


353 


38 


1683 


gi|6524623|g 
b|AAF15097
.1|


Phascolarctos 
cinereus 


gag protein 


343 


is 

DO 


1684 


gi19110438


Homo sapiens 


polycystin-1L1


712


QQ 

yo 


1684 


gi6361629 


Periplaneta 
americana 


vitellogenin 




ZD 


1684 


gi3115393


Rana pipiens


guanylate cyclase inhibitory protein




JJ 


1686 


AAY91542 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 92 SEQ ID 
NO:215. 


ziz 


OH 


1686 


gi1279841


Bos taurus 


glycine transporter 


72 


36 


1686 


gi19879917


Oryza sativa 


acid phosphatase 


70


33 


1687 


gi12056568


Homo sapiens 


MSTP063 


212 


88


1687 


gi13539684


Homo sapiens 


zinc finger protein 291 


212 


88 


1687 


gi|12056568| 
gb|AAG479
45.1|AF119
814_1


Homo sapiens 


MSTP063 


212 


88 


1689 


gi5689766 


Homo sapiens 


zinc finger 2.2 


222 




1689 


AAU16267 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 1220. 


178 


58 


1689 


AAB99950 


Homo sapiens 


SHAN- Human alkylated-DNA-protein
cysteine methyltransferase 14. 


1 n 

ill 


oo 


1690 


gi3328880 


Chlamydia 
trachomatis 


Protein Export 


73 


/y 


1690 


gi2832232 


Brucella
melitensis biovar
Abortus


flagellin; FliC 


67


iy 


1690 


gi17984285


Brucella 
melitensis 


FLAGELLIN 


67 


29 


1692 


gi4927443 


Haemophilus 
influenzae 


hemoglobin/hemoglobin-haptoglobin 
binding protein 


93 


80 


1692




Haemophilus
influenzae


hemoglobin and hemoglobin - 
haptoglobin binding protein 


93 


80 


1692 


gi3647226 


Haemophilus 
influenzae 


hemoglobin binding protein 


93 


80 


1694 


AAW95631 


Homo sapiens 


GEMY Homo sapiens secreted protein 
gene clone hj968 2. 


102 


100 


1694 


gi13162186


Homo sapiens 


calsyntenin-3 protein 


102 


100 
1695 


AAO04205 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 18097. 


81 


37 


1695 


gi160180


Plasmodium 
cynomolgi 


circumsporozoite antigen 


81 


29 


1695 


gi495522 


Plasmodium 
simiovale 


circumsporozoite protein 


oO 




1696 


AAM80223 


Homo sapiens 


HYSE- Human protein SEQ ID NO
3869. 


Zjz 


00 


1696 


AAM79239 


Homo sapiens 


HYSE- Human protein SEQ ID NO
1901. 


252 


oo 


1696 


gi3688394


Homo sapiens 


triple LIM domain protein 


252 


66 


1697 


gi19887715


Methanopyrus
kandleri AV19


Predicted membrane protein 


74 


28 


1698 


AAM93184 


Homo sapiens 


HELI- Human polypeptide, SEQ ID
NO: 2552.


269 


87 


1698 


gi18044066


Mus musculus 


RIKEN cDNA 5033406L14 gene 


226 


nfi 
fo 


1698 


AAB95302 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:17538.


194 


78 


1699 


ABB17279 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 5936. 


110 


56 


1699 


AAO13013 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 26905. 


101 


71 


1699 


gi|7650258|g 
b|AAF65960 
.1|AF20777 
0_1


Hepatitis C virus


polyprotein 


74 


28 


1700 


gi12697585


Arabidopsis 
thaliana 


4-(cytidine 5'-phospho)-2-C-methyl-D-
erythritol kinase


69 


40 


1701 


gi16740569


Homo sapiens 


Similar to thymus expressed gene 3 


84 


27 


1701 


gi17940760


Mus musculus 


cask-interacting protein 2 




ZO 


1701 


gi17940758


Homo sapiens 


cask-interacting protein 1 


77 


26 


1702 


gi17385401


Homo sapiens 


TPIP alpha lipid phosphatase 


234 


oz 


1702 


AAU75783 


Homo sapiens 


INCY- Human protein phosphatase 1 
(PP1) protein sequence. 


208 


57 


1702 


AAG67638 


Homo sapiens 


HELI- Amino acid sequence of a 
human protein. 


202 


56 


1703 


AAO07887 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 21779.


246 


85 


1703 


AAO08651 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22543. 


239 


83 


1703 


AAO08732 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22624. 


221 


80 


1704 


AAB94588 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:15392.


82 


52 


1704 


gi3288914 


Mus musculus 


aortic carboxypeptidase-like protein 
ACLP 


82 


24 


1704 


AAM93437 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 


81


Jz 


1706 


AAM86104 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 13697.


179 


100 


1706 


gi10039425


Equus caballus 


ALR protein 


120 


40 


1706 


gi20502826 


Eimeria maxima 


cGMP-dependent protein kinase 


115 


35 


1707 


AAM70251 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ
ID NO: 30557.


115


78


1707 


AAM57834 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 2993y. 


115


HQ 

to 


1707 


gi15450860


Arabidopsis 
thaliana 


serine/threonine-protein kinase Mak 
(male germ cell-associated kinase)-like 
protein 


71


jO 


1708 


gi1620403


Homo sapiens 


SF1-Bo isoform


82 


41


1708 


gi19072991


Hypocrea virens 


class III chitinase precursor 


o 
oz 




1708 


gi18765873


Hypocrea virens 


class III chitinase 


82 


40 


1709 


AAM52240 


Homo sapiens 


INCY- Human MFAP4 SEQ ID NO 3. 


1384 


100 


1709 


gi790817 


Homo sapiens 


microfibril-associated glycoprotein 4 


1384


100 


1709 


AAM52239 


Homo sapiens 


INCY- Human MAG4V SEQ ID NO 1. 


1374 


100 


1710 


gi16769882


Drosophila 
melanogaster 


SD07884p 


67 


27 


1710 


gi|17545505| 
ref|NP_5189
07.1| 


Ralstonia 
solanacearum


CONSERVED HYPOTHETICAL 
PROTEIN 


66 


41


1711 


AAU82954


Homo sapiens 


ANAD- Human homologue of MPT 1
protein target for antifungal compound. 


111 


27 


1711 


gi2058326 


Homo sapiens 


subunit of RNA polymerase II
transcription factor TFIID 


111 


27 


1711 


gi13559031


Homo sapiens 


bAHM20.1 (TATA box binding
protein (TBP)-associated factor, RNA
polymerase II, C1, 130kD)


108 


26 


1712 


AAB65626 


Homo sapiens 


SUGE- Novel protein kinase, SEQ ID
NO: 152. 


209


82 


1712 


AAM25283 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:798. 


209 


82 


1712 


AAU17269 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID 834. 


176 


67 


1713 


gi18256065


Mus musculus 


Similar to ATPase, class II, type 9A 


127 


67 


1713 


AAM76495 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 36801. 


123 


70 


1713 


AAM63681 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35786. 


123 


70 


1714 


gi8096269 


Nicotiana 
tabacum 


KED 


149 


Zo 


1714 


gi1752736


Saccharomyces 
cerevisiae 


gene required for phosphorylation of
oligosaccharides/ has high homology
with YJR061w


148 


30 


1714 


gi2292986 


Rattus 
norvegicus 


cyclic nucleotide-gated channel beta 
subunit


141 


28 


1715 


AAM72995 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 33301. 


158 


47 


1715 


AAM60359 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 32464. 


158 


47 


1715 


gi|13539605|
emb|CAC35
733.1|


Paramecium
tetraurelia


cyclophilin-RNA interacting protein


144


45


1716 


AAM71015 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31321. 


251 


64 


1716 


AAM58517 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30622. 


251 


64 


1716 


AAU19766 


Homo sapiens 


HUMA- Human novel extracellular 
matrix protein, Seq ID No 416.


161 


44 


1718 


gi1420924


Zea mays 


INI 


75 


27 


1718 


gi|14521970| 
ref|NP_1274
47.1| 


Pyrococcus 
abyssi 


O-sialoglycoprotein endopeptidase 


73 


35 


1719 


gi20513851 


Hordeum 
vulgare 


BPM 


74 


35 


1719 


gi21039126


Cryptosporidium 
parvum 


60 kDa glycoprotein 


74 


26 


1719 


gi207158 


Rattus 
norvegicus 


big tau 


73 


36 


1720 


gi18181943


Caenorhabditis 
elegans 


heparan sulfate GlcNAc transferase-I/II 


67 


1A 


1720 


gi2058699 


Caenorhabditis 
elegans 


multiple exostoses homolog 2 


67 


34 


1720 


gi|17554740| 
ref|NP_4993
68.1| 


Caenorhabditis 
elegans 


MULTIPLE EXOSTOSES 
HOMOLOG 2 


67 


34 


1721 


AAM69150 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29456. 


200 


38 


1721 


AAM56769 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28874. 


200


38 


1721 


gi4185947 


Human 
endogenous 
retrovirus K 


pol protein 


196 


38 


1722 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


615 


60 


1722 


gi18676710


Homo sapiens 


FLJ00254 protein 


592 


60 


1722 


gi|20469453| 
ref|XP_1140
40.1| 


Homo sapiens 


similar to FLJ00254 protein


283 


50


1723 


gi13881755


Mycobacterium
tuberculosis
CDC1551


cation efflux system protein 


74 


30 


1724 


AAG78866 


Homo sapiens 


SHAN- Human zinc finger protein 15.


141


Oo 


1724 


ABB17928


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 65<S5. 


QO 


*\1 
jj 


1724 


gi|21295712| 
57.1| 


Anopheles
gambiae str.
PEST


agCP1631 


75 


26 


1725 


gi21104340


Homo sapiens 


obscurin 


1586 


83 


1725 


gi7024535 


Gallus gallus 


structural muscle protein titin 


207 


24 


1725 


gi1513030


Gallus gallus 


connectin/titin 


207 


24 


1727 


AAE19162 


Homo sapiens 


THOR/ Human kinase polypeptide 
(PKIN-20). 


1096 


99 
1727


gi2736151 


Rattus 
norvegicus 


myotonic dystrophy kinase-related
Cdc42-binding kinase 


902 


78 


1727


gi1695873


Homo sapiens 


ser-thr protein kinase PK428 


896 


77 


1728 


AAY99411 


Homo sapiens 


GETH Human PRO1487 (UNQ756)
amino acid sequence SEQ ID NO:260. 


862 


67 


1728 


gi15617453


Homo sapiens 


chondroitin synthase 


862 


67 


1728 


AAE15959 


Homo sapiens 


EUMO- Human 4589624/92-303 
protein, member of Fringe and Brainiac 
family. 


761 


79 


1729 


gi|15804980| 
ref|NP_2909 
60.1|


Escherichia coli
O157:H7
EDL933


Uncharacterized conserved protein 


71 


33 


1731 


gi14268490


Musca domestica 


hunchback 


82 


33 


1731 


AAM93401 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 30U2.


76 


27 


1731 


gizu /oouo 


Musca domestica


hunchback zinc finger protein 


ni 
15 


"3A 
3U 


1732


A A VQt QAQ 


Homo sapiens


INCY- Human cytoskeleton associated 
protein 4 (CYSKP-4). 


104/ 


5/ 


1732


ABB90754 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 240.


1043 


57 


1732 


gi619577 


Gallus gallus 


cardiac muscle tensin 


1043 


56 


1733 


gi3090889 


Homo sapiens 


synapsin IIIa


70 


38 


1733 


gi6572355 


Homo sapiens 


cE86D10.1 (synapsin III)


70 


38 


1733 


gi|19924105| 
ref|NP_0034
81.2|


Homo sapiens


synapsin III, isoform IIIa


70 


38 


1734 


AAB85144 


Homo sapiens 


HUMA- Human NKCR polypeptide 
(clone ID HMSOM53). 


1506 


93 


1734 


gi4973126


Mus musculus 
castaneus 


high affinity immunoglobulin gamma 
Fc receptor I 


490 


39 


1734 


gi4973124


Mus musculus 


high affinity immunoglobulin gamma 
Fc receptor I 


489 


39 


1735 


gi|15597595| 
ref|NP_2510
89.1|


Pseudomonas 
aeruginosa 


pyoverdine synthetase D 


69 


30 


1736 


gi14488302


Oryza sativa 


Putative transposon protein 


81 


24 


1736 


gi3851516 


Phytophthora 
infestans


cyst germination specific acidic repeat 
protein precursor 


72 


33 


1736 


gi|14488302| 
gb|AAK638
83.1|AC074
105_12


Oryza sativa 


Putative transposon protein 


81 


24 


1737 


AAB85357 


Homo sapiens 


INCY- Human phosphatase (PP) (clone 
ID 3402521CD1).


1591 


100 


1737 


gi21205864 


Homo sapiens 


T-cell activation protein phosphatase 
2C; TA-PP2C 


1591 


100 


1737 


gi21464366


Drosophila
melanogaster


RE06653p 


758 


52 


1738 


gi7271811


Drosophila 
melanogaster 


GTPase activating protein 


292 


38 


1738 


AAM76430


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36736. 


246 


100 


1738 


AAM63615 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 35720.


246


100


1739 


ABB50365 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 65 SEQ ID NO:313. 


272 


97 
Of 


1739 


AAW88598 


Homo sapiens 


HUMA- Secreted protein encoded by 
gene 65 clone HFVHY45. 


272 


87 


1739 


ABB50764 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 65 SEQ ID NO:716. 


143 


92 


1740 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


1210 


58 


1740 


gi|10834720| 
gb|AAG237 
90.1|AF258 
587_1


Homo sapiens 


PP565 


274 


80 


1740 


gi|385615|gb 
|AAB26708.
1| 


Mus sp. 


fibulin gene homolog 


248 


75 


1741 


ABB90748 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 228. 


2116 


97 


1741 


gi15987493


Homo sapiens 


tumor endothelial marker 6 


1 1 1 a 


y 1 \ 


1741 


ABB90754 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 240.


530 


37 


1742 


ABB11753 


Homo sapiens 


HYSE- Human NOV/plexin-A1
homologue, SEQ ID NO:212J.


291 


90 


1742 


gi1665757


Mus musculus 


plexin 1 


001 
Ly 1 


on 


1742 


gi6010217 


Homo sapiens 


NOV/plexin-A1 protein


291 


yu 


1743 


AAM79514 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3160. 


149 


90 


1743 


AAM78530 


Homo sapiens 


HYSE- Human protein SEQ ID NO
1192. 


149 


90 


1743 


gi1244510


Homo sapiens 


p311 protein 


149 


90 


1744 


AAG93324 


Homo sapiens 


NISC- Human protein HP10370.


83 


41


1744 


gi21064771 


Drosophila 
melanogaster 


RH61467p 


83 


46 


1744 


gi18676554


Homo sapiens


FLJ00174 protein


77 


41 


1745 


gi4128039 


Homo sapiens 


TL132 protein 


81 


29 


1745 


gi17983118


Brucella 
melitensis 


METAL DEPENDENT HYDROLASE 


74 


23 


1745 


AAU75578 


Homo sapiens 


UYNA- Human ubiquitin specific 
protease 10 (USP10).


71 


31 


1746 


gi15074154


Sinorhizobium 
meliloti 


PUTATIVE FATTY 
ACID/PHOSPHOLIPID SYNTHESIS 
PROTEIN 


76 


25 


1746 


gi1869833


human 
herpesvirus 2 


myristylated tegument protein 


75 


27 


1746 


gi20516045 


Thermoanaeroba
cter
tengcongensis


Chemotaxis response regulator CheB, 
consists of CheY-like receiver domain 
and a methylesterase (demethylase) 

mjlllallt 


69 


20 


1747 


gi18025496


cercopithecine
herpesvirus 15 


EBNA-1 


124 


37 


1747 


gi5821153 


Homo sapiens 


RNA binding protein 


123 


29 


1747 


gi6649242 


Homo sapiens 


splicing coactivator subunit SRm300 


123 


29 


1748 


gi|4321764|g
b|AAD1581
9.1|


Mus musculus


MAP kinase kinase 7 alpha 2


65


30


1748 


gi|20859704| 
ref|XP_1339
86.1| 


Mus musculus 


mitogen activated protein kinase kinase 
7 


65 


30 


1748 


gi|4321768|g 
b|AAD1582 

iii 


Mus musculus 


MAP kinase kinase 7 beta 2 


65 


30


1749 


AAB50964 


Homo sapiens 


GETH Human PRO1313 protein.


439 


89 


1749 


AAB47290 


Homo sapiens 


GETH PRO1313 polypeptide.


439 


89 


1749 


AAB24431 


Homo sapiens 


GETH Human PRO1313 protein
sequence SEQ ID NO:216. 


439 


89


1750 


AAU00502 


Homo sapiens 


MILL- Human TANGO 437 protein. 


115 


91 


1750 


gi20384654 


Homo sapiens 


two-pore calcium channel protein 2 


115 


91 


1750 


AAM91059 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 18652.


93 


64 


1751 


gi10440494


Homo sapiens 


FLJ00092 protein 


252 


97 


1751 


AAM40956 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 5887.


80 


30 


1751 


gi|10440494| 
dbj|BAB157 
80.1| 


Homo sapiens 


FLJ00092 protein 


252 


97 


1752 


gi15980036


Yersinia pestis 


2-dehydro-3-deoxyphosphooctonate 
aldolase 


77 


46 


1752 


gi11322261


Diceros bicornis 


alpha adrenergic receptor 2B 


74 


26


1752 


gi20516240


Thermoanaeroba
cter
tengcongensis


methylaspartate mutase 


73 


25 


1753 


gi19684014


Homo sapiens 


similar to brain-specific angiogenesis 
inhibitor 3 (H. sapiens) 


1387 


99 


1753 


AAB88367 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0101. 


1380 


99 


1753 


gi1469936


Mus musculus 


FGF-binding protein 


158 


29


1754 


AAB01397 


Homo sapiens 


INCY- Neuron-associated protein. 


435 


92 


1754 


gi21218140


Homo sapiens 


rab effector MYRIP 


435 


92 


1754 


gi21320161


Mus musculus 


exophilin 8 


378 


77 


1755 


AAM74815 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35121. 


253 


75 


1755 


AAM62013 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 34118. 


253 


75 


1755 


AAM70390 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 30696. 


228 


62 


1756 


gi6460201


Deinococcus 
radiodurans 


phenylacetic acid degradation protein 
PaaA 


85 


27 


1756


gu^uio^o 


Takifugu
rubripes


IVLULr 


70 




1756 


AATl0059_ 
aal 


Homo sapiens 


USSH erbB-3 cDNA clone E3-16. 


74 


31


1757 


gi18676406


Homo sapiens 


FLJ00021 protein 


70 


36 


1758 


gi13423395


Caulobacter 
crescentus CB15 


NADH dehydrogenase I, M subunit 


78 


37 
1758


gi|17506337| 
ref|NP_4913
90.1| 


Caenorhabditis 
elegans 


D1007.15.p 


82 


24 


1758 


gi|16126181| 
ref|NP_4207
45.1| 


Caulobacter 
crescentus CB15 


NADH dehydrogenase I, M subunit 


no 


D 1 


1759 


gi19881193


chimpanzee 
cytomegalovirus 


transcriptional transactivator TRS1 


83 


29 


1759 


gi19881161


chimpanzee 
cytomegalovirus 


transcriptional transactivator IRS1 


oo 


Zy 


1759 


gi556297 


Mus musculus


alpha-1 type IV collagen


81 


I'X 


1760 


gi18033185


Danio rerio 


UNC45-related protein 


702 


79 


1760 


AAG77802 


Homo sapiens 


HUMA- Human HOGEN50
serine/threonine phosphatase protein
sequence.


olio 


OJ 


1760 


AAM40290


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3435. 


603 


OJ 


1761 


gi6634123 


Drosophila 
melanogaster 


SoxNeuro 


70 


24 


1762 


gi|14245700| 
dbj|BAB561 
42.1| 


Giardia 
intestinalis 


kinesin-like protein 4 


69 


26 


1762 


gi|165011|gb 
|AAA31246.
1|


Oryctolagus 
cuniculus 


eucaryotic release factor (eRF) 


69 


24 


1762 


gi|15559188| 
emb|CAC03 
424.2| 


Homo sapiens 


dJ45P21.3 (butyrophilin, subfamily 3, 
member Al) 


69 


26 


1763 


AAM93661 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3536. 


186 


80 


1763 


AAM64398 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 36503.


154 


76 


1763 


gi|20556958| 
ref|XP_0615
62.5|


Homo sapiens 


similar to PAM COOH-terminal 
interactor protein 1 


73 


43


1764 


AAU17223 


Homo sapiens 


HUMA- Novel signal transduction
pathway protein, Seq ID 788.


oil 

Zl 1 


R7 


1765 


gll33454o 


Podospora 
anserina 


uod KJJi \ij grp jld protein 


71


J i 


1765 


gi5679307


Mus musculus


RORgamma t


/ V 


97 


1765 


gi4186077 


Mus musculus 


ROR gamma T protein 


70 


27 


1766 


gi17864081


Mus musculus


PPAR gamma coactivator-1beta protein






1766 


gi44795 


Methanococcus 
voltae 


polyferredoxin 


71 


28 


1766 


gi14279670


Lycopersicon 
esculentum 


verticillium wilt disease resistance 
protein 


71 


31 


1768




Homo sapiens 


SAGA Human protein having
hydrophobic domain, HP10778.


165 


100 


1768 


AAM40979 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5910. 


165 


100 


1768 


AAB24542 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 27 SEQ ID 
NO:168. 


73 


30 
SEQ
ID
NO:


Accession
No.


Species


Description


Score


%
Identity


1769 


gi6174840 


Achromobacter
xylosoxidans
subsp.
xylosoxidans


low-specificity D-threonine aldolase 


78 


33 




oii67fiQR0fi 


Drosophila
melanogaster 


oDVZ ooup 


/J 


23 


1769 


gi1098473


Rattus
norvegicus


insulin-like growth factor binding
protein


73 


31 


1770 


AAP94684 


Homo sapiens 


CHIL Amino acid sequence encoded 
by part of human mannose binding 
protein (hMBP) genomic DNA.


79 


56 


1770


giji j /yuDHo| 
ref|NP_2803

79 11 
/Z.l| 


Halobacterium
sp. NRC-1


cobyric acid synthase; CbiP 


69 


36 


1770 


gi|11467609|
61.1| 


Guillardia theta 


Clp protease ATP binding subunit 


69 


27 


1772 


gi5532460 


Shigella flexneri 


ShiF 


66 


32 


1773 


gi11544663


Arabidopsis
thaliana


PTPKIS1


/J 




1773 


gi11595504


Arabidopsis
thaliana


PTPKIS1 protein 


75 


42 


1773 


gi18389331


Mus musculus


2',5'-oligoadenylate synthetase-like 10


73 


42 


1774 


A AM06*\1Q 


Homo sapiens


HYSE- Human foetal protein, SEQ ID
NO: 250. 


414


90 


1774 


trill R5^99/l»l 
gl|iOJ jZZ'foJ 

ref|XP_0925
10.1|


Homo sapiens 


similar to latent transforming growth 
factor beta binding protein 1; latent
TGF beta binding protein


69 


37 


1775 


gi4884924 


Rangiferine
herpesvirus 1


glycoprotein C 


67 


60 


1775 


A ATCQ41 S9 


Homo sapiens


HELI- Human protein sequence SEQ

JJL* rvil^. l*HO 3. 


o5 


34 


1775 


AAB93253 


Homo sapiens


HELI- Human protein sequence SEQ
ID NO: 12271. 




in 


1776 


gi13424176


Caulobacter 


N-carbamyl-L-amino acid
amidohydrolase


89 


24 


1776


gi514267 


Homo sapiens 


proto-oncogene tyrosine-protein kinase 


86 


29 


1776 


gi28237 


Homo sapiens 


p150 protein (AA 1-1130)


84 


28 


1777 


glOJj / u 


Gallus gallus


dystrophin (AA 1 - 3660)


OO 


31


1777 


gi|3046783|e 

rnHlrA A6R0 

33.1| 


Scyliorhinus
canicula


dystrophin 


67 


29 


1777


oi I93496R9 la 

b|AAB7040 
6.1| 


Arabidopsis
thaliana


contains similarity to Rattus AMP-
activated protein kinase (gb|X95577).


0/ 


1 1 


1778 


AAE16176 


Homo sapiens 


INCY- Human G-protein coupled 
receptor 7 (GCREC-7) protein. 


1419 


100 


1778 


AAE18021 


Homo sapiens


CURA- Human G-protein coupled
receptor-8a (GPCR-8a) protein. 


1419 




1778 


AAG72411 


Homo sapiens 


YEDA Human OR-like polypeptide 
query sequence, SEQ ID NO: 2092.


1419 


100 


1779 


AAM76040 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 36346. 


93 


48 


1779 


AAM63227 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35332. 


93 


A Q 


1779 


gi12620576


Bradyrhizobium 
japonicum 


ID342 


87 


24 


1780 


gi2459833 


Rattus 
norvegicus 


Maxpl 


81




1780 


AAB65650 


Homo sapiens 


SUGE- Novel protein kinase, SEQ ID 
NO: 177. 


on 
* oU 




1780 


AAM39805 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2950. 


OA 


JO 


1781 


gi4877963 


Mus musculus 


NF-kappaB inducing kinase 


69 


39 


1781 


gi15077865


Mus musculus 


bullous pemphigoid antigen 1-b 


67 


35


1781 


gi15077863


Mus musculus 


bullous pemphigoid antigen 1-a 


67 


35 


1782 


gi4138265 


Nicotiana 
tabacum 


Avr9 elicitor response protein 


76 


27 


1782 


gi12725153


Lactococcus
lactis subsp.
lactis


50S ribosomal protein L3


75 


32 


1782 


AAB21008 


Homo sapiens 


INCY- Human nucleic acid-binding 
protein, NuABP-1 2. 


73 


32 


1783 


gi3947714 


Streptococcus 
agalactiae 


initiation factor IF2 


86 


20 


1783 


gi9558387 


Streptococcus 
agalactiae 


initiation factor 2 


86 




1783 


gi9558369 


Streptococcus 
agalactiae 


initiation Factor 2 


OO 




1786 


gi435855 


Mus sp.


CREB-binding protein; CBP






1786 


gi2911464


Leishmania 
tarentolae 


sodium stibogluconate resistance 
protein 


75 


34 


1786 


gi19547887


Mus musculus 


CREB-binding protein 


75 


zz 


1787 


gi3747099 


Mus musculus 


C1q-related factor


616 


61 


1787 


gi14278927


Mus musculus 


gliacolin 


615 


64 


1787 


gi10566471


Mus musculus 


Gliacolin 


615 


64 


1788 


gi|21291197|
gb|EAA033
42.1|


Anopheles 
gambiae str. 
PEST 


agCP7579 


71 


20 


1788 


gi|20803964| 
emb|CAD31 
541.1|


Mesorhizobium 
loti 


HYPOTHETICAL PROTEIN 


69 


43 


1789 


AAM41125 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6056. 


320 


80 


1789 


AAM39339 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2484. 


320 


80 


1789 


AAM79857 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3503. 


320 


80 


1790 


gi1143585


Paracentrotus 
lividus 


2 alpha fibrillar collagen 


69 


23 


1791 


gi9837427 


Lytechinus 
variegatus 


embryonic blastocoelar extracellular 
matrix protein precursor 


116 


34 


1791 


gi14089698


Mycoplasma 
pulmonis 


OLIGOPEPTIDE ABC 
TRANSPORTER PERMEASE 
PROTEIN 


71 


23 


1791 


gi6572111 


Bartonella
quintana


riboflavin synthase alpha chain


69


29


1792 


gi|4506023|r 
ef|NP_0027
10.1| 


Homo sapiens 


protein phosphatase 2, regulatory 
subunit B (B56), gamma isoform 


68 


39


1793 


AAM71170 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31476. 


180 


82 


1793 


AAM58664 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 30769. 


180 


82 


1793 


AAM65679 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 37784. 


168 


71 


1794 


AAG00072 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4153. 


125 


80


1794 


AAW34618 


Homo sapiens 


IMUT- Human C3 protein mutant DV- 
7N. 


125 


80 


1794 


AAW34617 


Homo sapiens 


IMUT- Human C3 protein mutant DV-
6.


125 


80 


1795 


AAY05069 


Homo sapiens 


SMIK Human PIGR-2 protein 
sequence. 


1055 


85 


1795 


gi396170 


Homo sapiens 


CMRF-35 antigen 


406 


45 


1795 


gil8490143 


Homo sapiens 


CMRF35 leukocyte immunoglobulin- 
like receptor 


406 


45 


1796 


gi|6723273|d 
bj|BAA8965
9.1| 


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


421 


41 


1796 


gi|13940448| 
gb|AAK503 
81.1|U43202 
2 


Murine leukemia 
virus 


pol precursor protein 


421 


41 


1796 


gi|331995|gb 
|AAB03091.

1| 


AKV murine 
leukemia virus 


gag-pol polyprotein (tag amber codon 
at 2250-2252 inserts Gln in Mo-MuLV)


421 


41 


1797 


gi21411325 


Homo sapiens 


Similar to LOC205103


260 


73 


1797 


gi|4835878|g 
b|AAD3028 
0.1|AF1348 
38 1 


Homo sapiens 


endocytic receptor Endo180


77 


31 


1797 


gi|16076075| 
emb|CAC94 
295.1|


Leishmania 

donovani 

donovani 


trypanothione reductase 


70 


30 


1798 


gi927721 


Saccharomyces 
cerevisiae 


Sip1p: SNF1 protein kinase substrate;
YDR422C; CAI: 0.13 


72 


34 


1798 


gil72604 


Saccharomyces 
cerevisiae 


protein kinase 


72 


34 


1798 


gi|6320630|r 
ef|NP_0107
10.1|


Saccharomyces 
cerevisiae 


SNF1 protein kinase substrate; Sip1p


72 


34 


1799 


gi|20839768| 
ref|XP_1303
11.1| 


Mus musculus 


similar to GDP-fucose transporter 1 


71 


29 


1801 


gi|17461642| 
ref|XP_0662


Homo sapiens 


similar to Ig kappa chain 


78 


23

49.1| 










1801 


gi|6325342|r 
ef|NP_0154
10.1|


Saccharomyces 
cerevisiae 


Protein required for cell viability; 
i pruo^cp 


76 


22 


1801 


gi|9635081|r 
ef|NP_0578
09.1| 


Gallid 

herpesvirus 2 


UL47 


74 


26 


1802 


AAB94148 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 14427. 




56 


1802 


AAG64564 


Homo sapiens 


orLAJN- xiuman zinc-iinger protein ou. 






1802 


AAM79356 


Homo sapiens 


HYSE- Human protein SEQ ID NO
3002. 




56


1803 


AAW81754 


Homo sapiens 


x3Uiir Human Fanconi anaemia-
associated gene 11 protein.




85 


1803 


gi2407911 


Homo sapiens 


differentially expressed in Fanconi 
anemia 


555 


74 


1803 


gi6013073 


Mus musculus 


HemT-3 protein 


89 


24 


1805 


gil4189735 


Homo sapiens 


ATP-binding cassette transporter
family A member 12 




90 


1805 


gil943947 


Bos taurus 


ABC transporter


404 


31 


1805 


AAZ94734_ 
aal 


Homo sapiens 


FARB Human ATP binding cassette 

ABCA1 (ABC1) cDNA.


395 


33 


1806 


AAU12234 


Homo sapiens 


Lxiiiii Human ri\AJ4,3.>u polypeptide 
sequence. 




100 


1806 


AAA96344_ 
aal 


Homo sapiens 


ijrxilil cDNA encoding a novel
polypeptide designated PRO4357.


498


48 


1806 


AAU12445


Homo sapiens 


vjii in xiuman x ivij*o J / poiypcpuae 
sequence. 


498 


48 


1807 


gil90396 


Homo sapiens 


profilaggrin 


76 


29 


1808 


AAB88367 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0101. 


74




1808 


gil9684014 


Homo sapiens 


similar to brain-specific angiogenesis 
inhibitor 3 (H. sapiens) 


74 


30 


1808 


gi| 18576362| 
refpO>J)844 
81. 1| 


Homo sapiens 


similar to fibroblast growth factor
binding protein 1 


74




1809 


gi530876 


Chlamydomonas 
reinhardtii 


amino acid feature: Rod protein
domain, aa 266 .. 468; amino acid
feature: globular protein domain, aa 32

.. ZOj 




^5 


1809 


gi6578849 


Myxococcus 
xanthus 


FrgA 


126 


29 


1809 


gi2429362 


Santalum album 


proline rich protein 


122 


27


1810 


gil7428288 


Ralstonia 
solanacearum 


TRANSPORTING ATPASE 
LIPOPROTEIN TRANSMEMBRANE 


75 


28 


1810 


gi21483422 


Drosophila 
melanogaster


LLo414zp 


71 


29 


1810 


ABB90042 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 2418. 


70 


32 


1811 


gi|20915248| 
ref|XP_1451
60.1| 


Mus musculus 


similar to Collagen alpha 1(VI) chain 
precursor 


148 


74 


1812 


gi2104558 


Rattus 


CCA3 


1150 


90






norvegicus 








1812 


AAB64963 


Homo sapiens 


ROSE/ Human secreted protein
sequence encoded by gene 24 SEQ ID 


172 


37 


1812 


gil2963869 


Mus musculus 


gene trap ankyrin repeat containing 
protein 


172 


37


1813 


AAB65201 


Homo sapiens 


GETH Human PRO1009 (UNQ493) 
proiem sequence ojdv^ aj-' iiv/.i7*r» 


208 


100 


1813 


AAY66678 


Homo sapiens 


GETH Membrane-bound protein 


208 


100 


1813 


AAB24068 


Homo sapiens 


GETH Human PRO1009 protein 
sequence oily iu inu.-do. 


208 


100 


1815 


AAG89314 


Homo sapiens 


GEST Human secreted protein, SEQ ID 


191 


100 


1815 


gi6460052 


Deinococcus 
radiodurans 


dipeptidyl peptidase IV-related protein 


66 


60 


1816 


gil052594 


Drosophila 
melanogaster 


trithorax protein trxI


75 


26 


1816 


gil052593 


Drosophila 
melanogaster 


trithorax protein trxII


75


26 


1816 


gil58818 


Drosophila 
melanogaster 


zinc-binding protein


75


26 


1817 


AAB49765 


Homo sapiens 


HELI- Human proliferation
differentiation factor amino acid 
sequence. 




94 


1817 


AAB88393 


Homo sapiens 


HELI- Human membrane or secretory 
protem clone roni^ui j i . 


229 


94 


1817 


gil8446895 


Drosophila 
melanogaster 


AT05866p 


73 


25 


1818 


gi6573212 


Giardia 
intestinalis 


variant-specific surface protein H7-1 


73 


32 


1818 


gil59143 


Giardia 
intestinalis 


variant-specific surface protein H7 


73 


32 


1818 


gil5144254 


Micrurus


neurotoxin homologue 8 


72 


32 


1819 


gil61857 


Tetrahymena 


surface antigen 


69 


35 


1821 




Carcinoscorpius
rotundicauda


factor C


80 


26 


1821 


gi217397 


Tachypleus 

tridentatus


limulus factor C precursor 


80 


26 


1821 




Tachypleus
tridentatus 


factor C precursor


80 


26 


1822 




Mus musculus


DNMT1 associated protein-1


74 


37 


1822 


gil666895 


Homo sapiens 


CHL1 protein 


74 


23 


1822 


gil6923930 


Mus musculus 


MAT1-mediated transcriptional
repressor 


74 


37 


1823 


gi9058659


Canis familiaris


skeletal muscle chloride channel ClC-1


73 


34 


1823 


gi433182 


Drosophila 
melanogaster 


receptor protein tyrosine phosphatase 


72 


26 


1823 


gi20429105 


Paracoccus 

zeaxanthinifacie 

ns 


decaprenyl diphosphate synthase 


72 


27 


1824 


Kil3374178 


Mus musculus 


TAFE140 protein 


612 


88


1824


<ri17R61 RRR 

gll / OUIOOO 


Drosophila

melanogaster 


GM10839p


246 


49 


1824


tlUV/JTU7U 


Drosophila

melanogaster 


BIP2 protein


242 


48 


1825






G6b-C protein 


1159 


100 


1825 


gil6605484 


Homo sapiens 


G6b-E protein 


1009 


90 


1825


glD jU*fo / / 




immunoglobulin receptor


1003 


83 


1826 


AAB94636 


Homo sapiens 


HELI- Human protein sequence SEQ 
m xrO'is^i s 

ixJ ri\J. LJ J 1 J* 


105 


37 


1826


A ATT1 

AAU1 J>U3 


Homo sapiens


HUMA- Human novel secreted protein,
OClj JUL/ OJU. 


105 


37 


1826 


gi21430928 


Drosophila 

melanogaster


SD27341p 


93 


39 


1827 


AAR33270 


Homo sapiens 


WIST- T cell receptor alpha chain 
clone alpha 1.3.


329 


92 


1827 


gil806100 


Homo sapiens 


T cell receptor alpha chain 


329 


92 


1827






TCRAV8S3 


329 


92 


1828 


gi20513851


Hordeum 

vulgare


BPM 


73 


45 


1828 


AAO01897 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 15789 


70 


35 


1828


A AP 16477 


Homo sapiens


OSTR- Human collagen alpha 1 (II)
protein. 


69 


31 


1829


A A<^66R^7 


Homo sapiens


SHAN- Human ATP-dependent serine
proteinase 31. 


356 


100 


1829


A Af?66K'38 


Homo sapiens


SHAN- Human ATP-dependent serine
proteinase 31 N-terminal peptide.


89 


100 


1829


gijooi i 


Gallus gallus


homeodomain protein


77 


38 


1830 


AAB94294 


Homo sapiens 


HELI- Human protein sequence SEQ 
TTi NCH474S 


951 


99 






Drosophila
melanogaster 


Rho guanine nucleotide exchange factor
4


180 


22 


1830


£1 07OO1 

giioiy /yzi 


Drosophila
melanogaster 


T DO^ 1 70n 


180 


22 


1831




Homo sapiens


HYSE- Human bone marrow expressed
protein SEQ ID NO: 107. 


199 


30 




giZVr JZ 1 0 1 


Canis familiaris


retinitis pigmentosa GTPase regulator


143 


24 


1831 


gi2062609 


Xenopus laevis 


middle molecular weight neurofilament 
protein NF-M(1)


140 


24 


1832 


AAB29778 


Homo sapiens 


RHOD- Human MSF-derived 
li-ftonectin. 


148 


18 


1832


en 1491 61 


Anaplasma

marginale 


surface antigen Am105


141 


25 


1832 


gi4808177 


Drosophila 
subobscura


largest subunit of the RNA polymerase 

II complex


141 


20 


1833 


AAM66321 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26627. 


424 


51 


1833 


AAM53933 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26038. 


424 


51 


1833 


gi|6723273|d 
bj|BAA8965
9.1|


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


357 


47


1834 


AAM88756 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ
ID NO: 16349. 


208 


100 


1834 


gi20417 


Persea americana 


cellulase 


77 


34 


1834 


gil53337 


Streptomyces
tenebrarius 


kanamycin-apramycin resistance 
methylase 


69 


26 


1837 


AAY02893 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 92. 


7A 

/o 


41 I 

HI 


1837 


AAY99429 


Homo sapiens 


GETH Human PRO 156 3 (Uisy/oy) 
amino acid sequence SEQ ID NO:317. 


Id 


7S 


1837 


gi6634084 


Drosophila 
melanogaster 


malate dehydrogenase (NADP-
dependent oxaloacetate 
decarboxylating), malic enzyme 


77 

ID 


70 


1838 


gi2865602 


Saccharopolyspo 
ra sp. 


Sapl M2 methyltransferase 


77 
1 1 


77 
Dl 


1838 


gi3089358 


Rattus 
norvegicus 


MARRLC2A 


75 


33 


1838 


gi|2865602|g 

b|AAC9718 

2.1| 


Saccharopolyspo 
ra sp.


Sapl M2 methyltransferase 


77 


37 


1839 


AAM69149 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 29455. 


154 


96 


1839 


AAM56768 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 28873. 


1 KA 


yO 


1839 


AAW96209 


Homo sapiens 


SMIK Amyloid precursor protein 
(APP) C-terminal fragment.


1 AO 


7fl 

/o 


1840 


gi9946563 


Pseudomonas 
aeruginosa 


probable type II secretion system 
protein 


Q 1 

ol 


7£ 
DO 


1840 


gi21108565


Xanthomonas 
axonopodis pv. 
citri str. 306 


pseudouridylate synthase 


75 


35 


1840 


ABB04714 


Homo sapiens 


SHAN- Human PP1744 protein SEQ 
ID NO:23.


7 A 

74 


ol 


1841 


gil491949 


Molluscum 
contagiosum 
virus subtype 1 


MC006L 


c< 

OJ 


7ft 

D\J 


1841 


AAM42085 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 7016. 


$21 
ol 


01 


1841 


AAM40299 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 3444. 


R1 
o 1 


01 


1842 


gi20381413 


Homo sapiens 


Similar to LOC160680 


216 


44 


1842 


gi!3592175 


Leishmania 
major 


PPg3 


1 AA 


OA 


1842 


gi5420387 


Leishmania 
major 


proteophosphoglycan 


1 A(\ 
14U 


77 
LO 


1843


A AR871R1 

J\J\DO / 101 


Homo sapiens


MILL- Human secreted protein 
MANGO 349 E41D variant, SEQ ID 
NO:231. 


278 


42 


1843 


AAB87128 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349, SEQ ID NO: 130. 


278 


42


1843 


AAB87179 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349 121K variant, SEQ ID 


276 


41
NO:227.






1844 


AAE14341 


Homo sapiens 


INCY- Human protease PRTS-6 
protein. 


886 


93 


1844 


gil6768276 


Drosophila 
melanogaster 


GH27809p 


290 


41 


1844 


gi2655204 


Mus musculus 


ubiquitin-specific protease 


258 


35


1846 


AAY88300 


Homo sapiens 


MILL- Human TANGO 187-3 protein. 


1334 


90 


1846 


gil3097780 


Homo sapiens 


Similar to RIKEN cDNA 2810037C14 
gene 


1326 


90 


1846 


AAY88296 


Homo sapiens 


MILL- Human TANGO 187-2/3
protein. 


1312 


87 


1847 


AAG74984 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:5748.


75 


32 


1847 


gil7352449 


Rattus 
norvegicus 


ErbB3/Her3 precursor 


74 


38 


1847 


gi|20860870| 
ref|XP_1256
64.1| 


Mus musculus 


similar to H4(D10S170) protein 


75 


32 


1848 


gi3123530 


Fowlpox virus 


fpI3L, orthologue of vaccinia BL 


75 


27


1848 


gi5902659 


Drosophila 
melanogaster 


ring canal protein 


70 


27 


1848 


gi|18110218| 
ref|NP_4765
89.2| 


Drosophila 
melanogaster 


kel-P2 


70 


27 


1849 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


614 


78


1849 


AAM65715 


Homo sapiens


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26021. 


548 


73 


1849 


AAM53338 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 25443. 


548 


73


1850 


gil0999071 


Lophognathus 
longirostris 


NADH dehydrogenase subunit 2 


74 


23 


1850 


gil8537243 


Human 

immunodeficienc 
y virus type 1 


envelope glycoprotein 


74 


29 


1850 


gi|10999071|
gb|AAG006 
22.2|AF128 
462 2 


Lophognathus 
longirostris 


NADH dehydrogenase subunit 2 


74 


23 


1851 


gi|17448210| 
ref|XP 0685 
03.1| 


Homo sapiens 


similar to 60 kDa heat shock protein, 
mitochondrial precursor (Hsp60) (60 
kDa chaperonin) (CPN60) (Heat shock 
protein 60) (HSP-60) (Mitochondrial 
matrix protein P1) (P60 lymphocyte
protein) (HuCHA60) 


72 


28 


1852 


gil 164937 


Saccharomyces 
cerevisiae 


YOR3160w 


74 


31 


1852 


gi3176662


Arabidopsis 
thaliana 


Similar to mannosyl-oligosaccharide 
glucosidase gb|X87237 from Homo 
sapiens. 


73 


31 


1852 


gil3398928 


Arabidopsis 
thaliana 


alpha-glucosidase 1 


73 


31 


1853 


gi|20889364| 


Mus musculus 


similar to hepatitis A virus cellular 


76 


36
ref|XP_1384
29.1| 




receptor 1; T cell immunoglobin 
domain and mucin domain protein 1






1853 


gi|21288202| 

gb|EAA005 

23.1|


Anopheles 
gambiae str. 
PEST 


agCP9342 


71 


32 


1854 


AAB88481 


Homo sapiens 


HELI- Human membrane or secretory 
protein clone PSEC0251.


776 


99 


1854 


AAE03835 


Homo sapiens 


HUMA- Human gene 18 encoded 
secreted protein HFKHW50, SEQ ID 
NO: 81. 


776 


99 


1854 


AAE03863 


Homo sapiens 


HUMA- Human gene 18 encoded
secreted protein HFKHW50, SEQ ID 
NO: 109. 


716 


97 


1855 


gil663748 


Chlamydomonas 
reinhardtii 


dynein heavy chain 7 


82 


29 


1855 


gil663744 


Chlamydomonas 
reinhardtii 


dynein heavy chain 5 


80 


28 


1855 


gil663738 


Chlamydomonas 
reinhardtii 


dynein heavy chain 2 


80 


27 


1856 


gil8032120 


Gallus gallus 


shal-like voltage-gated potassium 
channel 


75 


23 


1856 


gil408569 


Haemophilus 
influenzae 


adhesion and penetration protein 


71 


28 


1856 


gi|18032120|
gb|AAL566 
33.1|AF075 
160 1 


Gallus gallus 


shal-like voltage-gated potassium 
channel 


75 


23 


1857 


AAM67180 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27486. 


129 


44 


1857 


AAM54795 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26900. 


129 


44 


1857 


gi|21040255| 
reflNP 6319 
07.1|


Homo sapiens 


splicing factor, arginine/serine-rich 12 


109 


29 


1858 


gi21392190 


Drosophila 
melanogaster 


RE74758p 


71 


39 


1858 


gi9954108 


Trypanosoma 
cruzi 


RNA binding protein RGGm 


68 


40 


1858 


gi20302994 


Medicago 
truncatula


nodule-specific glycine-rich protein 1C 


66 


32 


1859 


gi|20536244| 
ref|XP 0605 
05.4|


Homo sapiens 


similar to autoantigen La 


72 


30 


1860 


gi|17541362|
ref|NP_5024
09.1| 


Caenorhabditis 
elegans 


K08E7.5.p 


103 


29 


1860 


gi|17446900| 
ref|XP 0658 
33.1| 


Homo sapiens 


similar to DNA-directed RNA 
polymerase (EC 2.7.7.6) II largest 
chain - Mastigamoeba invertens 
(fragment) 


100 


34 


1860 


gi|9628166|r 
ef|NP_0427 


African swine 
fever virus 


CD2 homolog 


98 


30
52.1|










1861 


AAY70691 


Homo sapiens 


DAND Human membrane attracting. 


162 


40 


1861 


AAY70690 


Homo sapiens 


DAND Human membrane attractin-1.


162 


40 


1861 


gil2275390 


Rattus 
norvegicus 


membrane attractin 


162 


40 


1862 


gil0039425 


Equus caballus 


ALR protein 


81 


28 


1862 


gil3529521 


Mus musculus 


Similar to elastin microfibril interface
located protein


80 


32 


1862 


AAM40414 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3559. 


79 


39 


1863


trill f^RR^RQl 

gb|AAL267 
87.1|AF304 
442 1 


Homo sapiens


B lymphocyte activation-related protein
BC-1514 


0/17 




1863 


gi|20479028| 
ref|XP_1137
29.1| 


Homo sapiens 


similar to B lymphocyte activation- 
related protein BC-1514


117 


68 


1863 


gi|21301715|
gb|EAA138
60.1|


Anopheles

gambiae str. 
PEST 


agCP8366


OJ 


41 


1864 


AAU15851 


Homo sapiens 


HUMA- Human novel secreted protein, 
Seq ID 804


1275 


78 


1864 


AAU16312 


Homo sapiens


HUMA- Human novel secreted protein,
Seq ID 1265 


1193




1864 


AAG02054 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6135. 


308 


91 


1865 


AAB94953 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 16485. 


86 


29 


1865 


gi3746787


Homo sapiens


SYT interacting protein SIP




29 


1865 


gil 5022507 


Homo sapiens 


coactivator activator 


86 


29 


1866 


eil7133332 


Nostoc sp. PCC
7120


preprotein translocase SecY subunit


uo 


43 


1866 


ref|NP_0687 
73.1|




gap junction protein, alpha 3, 46kD

(connexin 46) 


oo 


4ft 

HI/ 


1867 


gi706930 


Rattus 
norvegicus 


cyclic GMP stimulated 
phosphodiesterase


191 


95 


1867 


AAV54762_ 
aal 


Homo sapiens 


UNIW Human cGS-PDE cDNA DNA 
sequence.


137 


100 


1867 


AAV36157 
aal 


Homo sapiens


T TM 1 HiiTTifln r»vp1ip-(TAy1~P— ■niifOpnriH** 

UJ.11 V V XXUUiOil wYV/lAw~VJlYJjr HUvlvUUUv 

phosphodiesterase cDNA. 




100


1868 


AAB95695 


Homo sapiens 


HELI- Human protein sequence SEQ 
TTjNO-18S16 


112 


27 


1868 


AAY91447 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 48 SEQ ID
NO: 168


112 


27 


1868 


AAY91393 


Homo sapiens


HUMA- Human secreted protein
sequence encoded by gene 48 SEQ ID 
NO: 114. 


112 


27 


1870 


AAU07886 


Homo sapiens 


WHED Polypeptide sequence for 
human hspG15. 


1454 


94 


1870 


gil3603891 


Homo sapiens 


MOV10-like 1


1454 


94 


1870 


gil3603857 


Mus musculus 


MOV10-like 1


954 


77 


1871 


AAM96652 


Homo sapiens 


HUMA- Human reproductive system 


484 


96
related antigen SEQ ID NO: 5310.






1871 


gi18676652


Homo sapiens 


FLJ00225 protein 


433 


95 


1871 


gi21386760


Berneuxia 
thibetica 


maturase R 


70 


32 


1872


AAO90304 
aal 


Homo sapiens


NISR Human thyroid peroxidase gene.


73 


29 


1872


AAW48781 


Homo sapiens


RSRR- Thyroid peroxidase. 


73 


29 


1872 


AAR75689 


Homo sapiens


NISR Human thyroid peroxidase.


73 


29 


1873 


AAG03774 


Homo sapiens


GEST Human secreted protein, SEQ ID
NO: 7855. 


228 


90 


1873 


gi338288


Homo sapiens 


preprosomatostatin I 


228 


90 


1873 


gi342299 


Macaca 
fascicularis 


preprosomatostatin 


228 


90 


1875 


AAR30418 


Homo sapiens 


DAND Nearly complete p107 protein.


76 


30 


1875 


gi347378


Homo sapiens


p107


76 


30 


1875 


gil 57871 


Drosophila 
melanogaster


P glycoprotein 


76 


24 


1876 


ABB17955


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 6612. 


186 


40 


1876


AAS17764 
aal 


Homo sapiens


GENA- Human Genomic DNA for 
CRYBB1. 


167 


39 


1876 




Homo sapiens


HYSE- Human polypeptide SEQ ID 
NO 16223. 


165 


42 


1877 


gi|59977|em 

Kir 1 A A 7866 

D|V-»/vrV / OUU 
2.1| 


Human
endogenous
retrovirus


tripartite fusion transcript PLA2L 


224 


76 


1878 


ABB84943 


Homo sapiens 


GETH Human PRO1556 protein
sequence SEQ ID NO:254. 


1056 


93 


1878 


AAB31670 


Homo sapiens 


PROT- Amino acid sequence of a 
human protein having a hydrophobic 
domain. 


1056 


93 


1878 


AAB47295 


Homo sapiens 


GETH PRO1556 polypeptide.


1056 


93 


1879


ADD 1 J O VI Jl 


Homo sapiens


HUMA- Human nervous system related 
polypeptide SEQ ID NO 4518. 


73 


36 


1880 


AAU83117 


Homo sapiens 


ZYMO Novel secreted protein 
Z799543G2P. 


66 


54 


1880 


gil2723186 


Lactococcus 
lactis subsp.
lactis 


outer membrane lipoprotein precursor 


66 


26 


1881 


gi609624


Vibrio cholerae 


EpsC 


73 


29 


1882 


gil2667456 


Rattus 
norvegicus 


synaptotagmin VIId


86 


32 


1882 


eil2667454 


Rattus 
norvegicus 


synaptotagmin VIIc 


85 


33 


1882 


gi334072 


Pseudorabies 
virus 


ORF-3 protein 


83 


35 


1883 


gil747 


Oryctolagus 
cuniculus 


trichohyalin 


119 


29 


1883 


gi2072290 


Xenopus laevis 


XL-INCENP 


100 


27 


1883 


gil2584554 


Human 

coxsackievirus 

B3 


polyprotein 


96 


25 


1884 


gi|15601413| 
ref|NP_2330


Vibrio cholerae 


sucrose-6-phosphate dehydrogenase 


65 


55
1 1










1885


gllOo/O^o / 


Homo sapiens


Similar to C-terminal modulator protein 


74 


35 


1885


gli JOOO / 1 4 * 




C-terminal modulator protein


74 


35 


1885 


AAO06984 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 20876 


70 


60 


1887 


AAW25939 


Homo sapiens 


CNRS T-cell receptor V-beta-5.1 
peptide fragment.


601 


99 


1887


gi36973 


Homo sapiens 


T-cell receptor beta-chain 


601 


99 


1887 




Homo sapiens


V segment translation product


600 


100 


1888 


glloo/440o 


Homo sapiens


partitioning-defective 3-like protein
splice variant c


198 


73 


1888


rri 1 £Ofn B7fl 

giioyujo/u 


Homo sapiens


partitioning-defective 3-like protein
splice variant b 


198 


73 


1888


giioyujooo 


Homo sapiens


partitioning-defective 3-like protein
splice variant a 


198 


73 


1889


cri71 ARQ^77 


Homo sapiens


MAPA protein


1620 


99 


1889


tril 1 ZLRO^^O 
glZ I*ro7JJu 


Qpio tflnnic 


MAPA protein


833 


56 


1889 


gi21489379 


Mus musculus 


MAPA protein 


630 


48 


1890 


AAY10874 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein


503 


100 






Ralstonia

solanacearum 


PROBABLE LIPOPROTEIN 


73 


44 


1891 


gil5723141 


Homo sapiens 


c349E10.1.1 (novel protein, isoform 1) 


180 


46 


1891 




Homo sapiens


HUMA- Breast and ovarian cancer
associated antigen protein sequence
SEQ ID 714


174 


47 


1891 


gil9353342 


Mus musculus 


RIKEN cDNA 9530058B02 gene


162 


47 


1892 


AAMoOUoo 


Homo sapiens


immune/haematopoietic antigen SEQ 

it/ lNVJ.lJU/7. 


95 


53 


1892 


AAO05973 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19865. 


94 


82 


1892 


AAO09418 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 23310. 


91 


70 


1893 


gi8778607 


Arabidopsis 
thaliana 


F5M15.23 


71 


25 


1894 


AAM65951 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 


69 


38 


1894 


AAM53568 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO- 7SrV71 


69 


38 


1894 


gi|20832567| 
ref|XP_1335
24.1|


Mus musculus 


similar to Heterogeneous nuclear 
ribonucleoprotein A3 (hnRNP A3)


163 


76 


1895 


AAM66299 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26605. 


440 


83 


1895 


AAM53913 


Homo sapiens


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26018. 


440 


83 


1895 


gi|6723273|d 
bj|BAA8965 
9.1| 


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


270 


45


1896 


gi4883988 


Bartonella 
clarridgeiae 


cell division protein FtsZ 


UO 


Zo 


1897 


AAO13209 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 27101. 


142 


54 


1897 


AAM66708 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27014. 


124 


46 


1897 


AAM54310 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26415. 


124 


40 


1898 


gi2565268 


Drosophila 
virilis 


pore-forming protein MIP family 


/D 


77 
Z / 


1898 


gi7453547 


Homo sapiens 


glioma tumor suppressor candidate 
region protein 1 


75 


31


1898 


gi3218331 


Metarhizium 
anisopliae 


nitrogen response regulator 


74 


26 


1899 


gi9656609 


Vibrio cholerae


chemotaxis protein CheA 


11 


10 


1899 


gi|20908537| 
ref|XP_1274 
14.1| 


Mus musculus 


RIKEN cDNA 1700001L19


443 


80 


1899 


gi|15642063| 
ref|NP_2316
95.1| 


Vibrio cholerae 


chemotaxis protein CheA 


73 


32 


1900 


gi|18586105| 
ref|XP_0914
00.1|


Homo sapiens 


similar to seal 


1AO 

203 


o4 


1900 


gi|20888279| 
ref|XP_1465
08.1|


Mus musculus 


similar to spinocerebellar ataxia type 1 


1 oo 


QO 
SZ 


1901 


gi338033 


Homo sapiens 


serum protein 


90 


32 


1901 


gi4808221 


Homo sapiens 


dJl 17715,2 (serum constituent protein 
MSE55) 


90 


32 


1901 


gi4098993 


Mus musculus 


polyhomeotic 2 


QQ 
oo 


1C\ 


1902 


AAB19933 


Homo sapiens 


INCY- Human oxidoreductase OXRD- 
8. 


250 


100 


1902 


gil9713043 


Fusobacterium 
nucleatum subsp. 
nucleatum 
ATCC 25586


Iron/zinc/copper-binding protein 


73 


22 


1902 


gi|20342079| 
ref|XP 1106 
14.1| 


Mus musculus 


TJ TTT7AT ~TYk.T A 1 1 f\f\f\(\1T2 1 (L 

KIKhN cDNA l/UOUOiiilo 


77 


O^ 

ZJ 


1903 


gi342279 


Macaca 
nemestrina 


opiomelanocortin 


231 


49 


1903 


gi28342 


Homo sapiens 


proopiomelanocortin 


230 


49 


1903 


gil90183 


Homo sapiens 


opiomelanocortin 


230 




1904 


gi|l 10371171 
0 uia A0974 

85.1|AF194 
537 1 


Homo sapiens 


NAG13 


180 


53 


1905 


gi5360984 


Homo sapiens 


dJ228H13.1 (similar to Ribosomal
protein L21e) 


152 


72 


1905 


AAB44126 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1571. 


150 


83
1905


gi550015


Homo sapiens 


ribosomal protein L21 


1 




1906 


gi2654610 


Pseudomonas 
aeruginosa 


arginine/ornithine succinyltransferase
AI subunit


79 


25 


1906 


gil7226812 


Botryotinia 
fuckeliana 


histidine kinase 


72 


33 


1906 


gil 6904238 


Botryotinia 
fuckeliana 


two-component osmosensing histidine
kinase BOS1p


77 


77 

JJ 


1908 


gi330359 


Human 
herpesvirus 4 


nuclear antigen precursor 


01 


in 


1908 


gil632793 


Human 
herpesvirus 4 


EBNA3C (EBNA 4B) latent protein 


91 


37 


1908 


gil 184677 


Candida albicans 


hyphal wall protein 1 


90 


38 


1909 


gil3177635 


Rattus 
norvegicus 


phospholipase C beta-3 


77 
/Z 


7A 
ZO 


1909 


gil 150880 


Mus musculus 


phospholipase C beta3 


/I 


ZD 


1909 


gil7105044 


Simian 
adenovirus 25 


10.1 kDa 


71 


31 


1910 


gi9857054 


Leishmania 
major 


possible CG7055 protein


71 


A7 


1910 


gil617560 


Leishmania 
major 


LCFACAS5;L5701.2 


67 


33 


1910 


gi|9857054|e
mb|CAC040
11.1|


Leishmania 
major 


possible CG7055 protein 


71 


47 


1911 


AAY87278 


Homo sapiens 


INCY- Human signal peptide
containing protein HSPP-55 SEQ ID 
NO:55. 


DU1 


07 
OZ 


1911 


AAB18912 


Homo sapiens 


GETH A novel polypeptide designated 
PRO1889.


501 


OA 

oZ 


1911 


AAU27659 


Homo sapiens 


ZYMO Human protem ArP5 1 J45l . 


41o 


1 1 


1912 


gi2065210


Mus musculus 


Pro-Pol-dUTPase polyprotein 


434 


Of\ 

oU 


1912 


gi|18676710|
dbj|BAB850
07.1|


Homo sapiens 


FLJ00254 protein


270 


CA 

04 


1913 


gi5713196 


Caenorhabditis 
elegans 


liprin-alpha homolog SYD-2 


479 


38 


1913 


gi930343 


Homo sapiens 


LAR-interacting protein 1b


467 


39 


1913 


gi930341 


Homo sapiens 


LAR-interacting protein 1a


40/ 


70 


1914 


gloo51021 


Mus musculus 


semaphorin cytoplasmic domain- 
associated protein 3B 


77/1 
Z/4 


£1 


1914 


giooDlOly 


Mus musculus 


semaphorin cytoplasmic domain- 
associated protein 3 A 


77zl 
Z 


a"* 
Oj 


1914 


AAM25720 


Homo sapiens 


HYSE- Human protein sequence SEQ 

TT\ VT/"V 1 7*2 < 


266 


61 


1915 


gi902214 


Zea mays 


RNA polymerase beta' subunit-2


/Z 


z*+ 


1915 


gil2482 


Zea mays 


RNA polymerase beta-2 subunit (AA 

I-IDZ/j 


72 


24 


1915


glj 1 1 *tu / X 0*T| 

ref|NP_0430
17.1| 


Zea mays


RNA polymerase beta' subunit-2


72 


24 


1916 


gil655432 


Mus musculus 


plexin2 


1135 


58 


1916 


AAM93435 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3070. 


1132 


57 


1916 


gi961515 


Xenopus laevis 


plexin 


1126 


54 
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No. 


Species 


Description 


Score 


%
Identity


1917 


gu5559064 


Mus musculus 


SNAG1


OO 


JO 


1917 


gi|20863586| 
ref|XP_1415
81.1| 


Mus musculus 


similar to dJ551D2.5 (novel protein) 


88 


30 


1917 


gi|18644890| 
ref|NP_5706
14.1|


Mus musculus 


sorting nexin associated golgi protein 1 


86 


38 


1918 


gil9528383 


Drosophila 
melanogaster 


RE04404p 


67 


32 


1919 


AAM77461 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 37767. 


189 


79 


1919 


AAM64684 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36789. 


189 


79 


1919 


gi|17477135| 
ref|XP_0634 
15.1| 


Homo sapiens 


similar to embryonal stem cell specific 
gene 1 


zoo 


/5 


1920 


gi2623757 


Rattus 
norvegicus 


neurabin 


172 


97 


1920 


gi2827450 


Gallus gallus 


KS5 protein 


154 


88 


1920 


gil3991829 


Xenopus laevis 


neurabin 


145 


83 


1923 


gi5532302 


Heterocapsa 
triquetra 


PSII CP47 apoprotein


75 


29


1923 


gil881335 


Bacillus subtilis 


SIMILAR TO YQrU, YXisJD, Y11J3
OF B. SUBTILIS.


68 


38 


1923 


gi|5532302|g 
b|AAD4470 

i.il 


Heterocapsa 
triquetra 


PSII CP47 apoprotein 


75 


29 


1924 


gi6855429 j 


Leishmania 
major 


possible mucin 1 precursor 


77 


33 


1924 


gi5832816 


Caenorhabditis 
elegans 


contains similarity to Pfam domain: 
PF01694 (Rhomboid family), 

Score=61.7, E-value=5.1e-15, N=1


74 


34 


1924 


AAB51976 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 48 SEQ ID
NO: 108. 


72 


38 


1925 


AAB51635 


Homo sapiens 


ROSE/ Human secreted protein 
sequence encoded by gene 16 SEQ ID 
NO:75.




31 


1925 


AAB47128 


Homo sapiens 


INCY- CDEFF-6, Incyte ID No. 
zUU943jL/D1. 


199 


34 


1925 


ABB55766 


Homo sapiens 


rxiCrl/ Human polypeptide o&Q id 
NO 138. 


iy / 


38 


1926 


AAG89279


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 399. 


33U 


/I A 

44 




A ADTAjCnA 

AAtS /uoyu 


Homo sapiens 


oKbN- Human nJJrJr protein sequence 
SEO ID NO-7 






1926 


gil3182757 


Homo sapiens 


HTPAP 


319 


44 


1927 


gil3 177290 


Ectocarpus 
siliculosus virus


EsV-1-8 


69 


36 


1928 


gil 8700171 


Arabidopsis 
thaliana 


AT5g20480/F7C8_70 


86 


39 


1928 


gi915207 


Sus scrofa 


gastric mucin 


83 


29 
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SEQ 
ID

NO: 


Accession 

No.


Species 


Description 


Score 


% 
Identity 






elegans 


linmpnrif rpoifvn mn^t lilce 

JJAJ111CULIV L&qL\J11 UlUOt ujv^/ 

HMPB_DROME: homeotic 
proboscipedia protein


79 


27 


1929 


ABB12295


Homo sapiens 


HYSE- Human secreted protein 
homologue SEQ ID NO:2665


135 


59 


1929 


AAG04080 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO- R161 


78 


38 


1929 


gi9279807 


Drosophila 


cortactin 


77 


27 


1930 


AAV81204_ 
aal 


Homo sapiens 


GEHO Human CD7 cDNA. 


872 


73 


1930 


AAB36657 


Homo sapiens 


IMMV Human CD7 protein sequence 
SEQ ID NO:2.


872 


73 




AAUUz4o6 


Homo sapiens


CrPTTO Human lymphocyte cell surface

antigen CD7 polypeptide. 


872 


73 


1931 


gi2636248 


Bacillus subtilis 


similar to transaldolase (pentose 

phosphate)


73 


29 


1931 


gi|21398633| 

rei|INJr 0340 

18.1| 


Bacillus 
anthracis A2012


Transaldolase, Transaldolase [Bacillus 


74 


29 


1931


gl|XOUi5v/04| 

ref|NP_3915


Bacillus subtilis


similar to transaldolase (pentose

phosphate) 


73 


29


1932 


AAB43545 


Homo sapiens 


HUMA- Human cancer associated 

protein sequence SEQ ID NO:990


73 


46 


1932 


AAM40234 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3379.


71 


26 


1934 


gi3129962


Gallus gallus 


B locus Lectin like Natural Killer cell 
surface protein 


82 


30 


1934 


AAB93791 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:13545.


77 


38 


1934 


gi2541864 


Drosophila 
melanogaster


DAD polypeptide 


77 


32 


1935 


gi|4959869|g 
b|AAD3453 


Murine leukemia 
virus 


polymerase 


335 


52 


1935 


gi|6524624|g 

kl A ATM ^fiQR 

•i| 


Phascolarctos 


pol protein 


331 


52 




ef|NP_0567
on 11 


Gibbon ape
leukemia virus 




328 


52 


1936 


gi6562332 


Arabidopsis 

thaliana


diaminopimelate decarboxylase 


86 


30 


1936 


gi7573355 


Arabidopsis 
thaliana


diaminopimelate decarboxylase-like 


86 


30 


1936 


gil5146250 


Arabidopsis 
thaliana 


AT5g11880/F14F18_50


86 


30 


1939 


AAU07442 


Homo sapiens 


GETH Human Wnt1 Upregulated
protein 2 (WUP2). 


300 


100 


1939 


AAU07441 


Homo sapiens 


GETH Human Wnt1 Upregulated
protein 1 (WUP1). 


300 


100 


1939 


AAB56802 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1380. 


300 


100 
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SEQ 
ID 
NO: 


Accession 
No. 


Species 


Description 


Score


%
Identity


1940 


gi5802814 


Homo sapiens 


Gag-Pro-Pol-Env protein


JO/ 


$7 


1940 


gi4185939


Human 
endogenous 
retrovirus K


pol protein 




•J 1 


1940 


gi5802821


Homo sapiens 


Gag-Pro-Pol protein 


586 


57 


1941 


AAU83088 


Homo sapiens 


ZYMO Novel secreted protein 

ZjZolZKjJr. 




100 


1941 


AAB20275 


Homo sapiens 


SCHE Human interleukin DNAX 80.


Jjj 




1941 


AAB20277 


Homo sapiens 


SCHE Human interleukin DNAX 80 
variant. 


529 


76 


1942 


AAM06866 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 
NO: 1074. 


994 


100 


1942 


gil7426446 


Homo sapiens 


bA351K23.5 (novel protein)


Oil 




1942 


gil5099951 


Mus musculus 


diacylglycerol acyltransferase 2 


915 


55 


1943 


AAM06596 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 

xiw lor? 
JNU: 511 . 






1943 


gi|15640499|

rei|INr_ZJUl 

9£ 1 1 
zu.l| 


Vibrio cholerae 


S-adenosylmethionine synthase


0/ 




1945 


AAG75561 


Homo sapiens 


HUMA- Human colon cancer antigen 

rtrrvfpin QPO TH *WfYfv$9^ 
proicin OJDV^ UL/ rNW.OJ^J. 


327 


100 


1945 


gil6416764 


Homo sapiens 


FKSG16 


327 


100 


1945 


gil3905212 


Mus musculus 


RIKEN cDNA 1200006F02 gene


261 


79 


1946 


gi288174 


Mus musculus 


UCtZD 


Q7 




1946 


gi53490 


Mus musculus 


Oct2.5 transcription factor 


97 


85 


1946 


gi9937478 


Drosophila 
melanogaster 


thyroid hormone receptor-associated 
protein TRAP170


72 


39 


1947 


AAM66980 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27286. 


170 


69 


1947 


AAM54574 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26679. 


170 


69 


1947 


AAM75189 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35495. 


159 


or 
OO 


1948 


AAY10874 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein.


1 no 

1UU 


1 no 


1949 


AAA 071 CC 

aal 


Homo sapiens 


Human P9 TYM A 






1949 


AAY94475 


Homo sapiens 


GENE- Predicted translation product of 
human P2 splice isoform, P2-B. 


100 


100 


1949 


AAY94474 


Homo sapiens 


GENE- Human P2 protein. 


100 


100 


1950 


gi95U20o2 


Homo sapiens 


tubby super-family protein 


so 




1950 


gi9502080 


Mus musculus 


tubby super-family protein 


77 


41 


1950 


gi81 18432 


Oryza sativa 


beta-expansin 


73 


35 


1951 


gi4808994 


walleye 

epidermal

hyperplasia virus 
type 1 


envelope polyprotein 


oy 


40 


1951 


gi|15642893| 
ref|NP_2279
34.1| 


Thermotoga 
maritima 


ribonucleotide reductase, B12-dependent


66 


46 


1952 


AAB80264 


Homo sapiens 


GETH Human PRO332 protein.


577 


61 
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NO: 


Accession 
No. 


Species 


Description 


Score 


%

Identity 


1952 


AAB33425 


Homo sapiens 


GETH Human PRO332 protein


577 


61 


1952 


AAY13396 


Homo sapiens 


GETH Amino acid sequence of protein 
PRO332.


577 


61 


1953 


gil6648392 


Drosophila 
melanogaster 


LD39243p 


449 


61 


1953 


AAG73684 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4446.


371 


55 


1953 


AAY48312 


Homo sapiens 


META- Human prostate cancer- 
associated protein 9. 


371 


55 


1954 


AAU84348 


Homo sapiens 


BAAK/ Protein MMP2 differentially 
expressed in breast cancer tissue.


2068 


94 


1954 


ABB90738 


Homo sapiens 


UYJO Human Tumour Endothelial 
Marker polypeptide SEQ ID NO 208. 


2068 


94 


1954 


AAB84607 


Homo sapiens 


PFIZ Amino acid sequence of matrix 
metalloproteinase gelatinase A. 


ZUOo 


QA 


1955 


gil6769680 


Drosophila 
melanogaster 


LD46678p 


245 


35 


1955 


AAM66797 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27103. 


148 


oU * 


1955 


AAM54396 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 26501. 


148 


80 


1957 


AAB80242 


Homo sapiens 


GETH Human PRO236 protein.


648 


97 


1957 


AAM93378 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 2955. 


648 


y/ 


1957 


AAB12157 


Homo sapiens 


PROT- Hydrophobic domain protein
from clone HP03165 isolated from KB
cells. 


648 


97 


1958 


AAM41696 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6627. 


234 


47 


1958 


AAU17119 


Homo sapiens 


HUMA- Novel signal transduction 
pathway protein, Seq ID oo4.


229 


46 


1958 


gil6741621 


Homo sapiens 


Similar to RAB37, member of RAS 
oncogene family 


228 


47 


1959 


gil8025526 


cercopithicine 
herpesvirus 15 


LF3 


140 


30 


1959 


gi3153821 


Mus musculus 


plenty-of-prolines-101; POP101; SH3- 
philo-protein 


137 


25 


1959 


gi39255 


Actinomyces 
viscosus 


sialidase 


129 


28 


1960 


ABB12366


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 120.


400 


90 


1960 


AAO12936


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 26828. 


113 


0^ 


1960 


AAM84898 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ
ID NO:12491.


1 1 j 




1961 


gil9110438 


Homo sapiens 


polycystin-1L1


190 


94 


1961 


gi3115393 


Rana pipiens 


guanylate cyclase inhibitory protein 


80 


35 


1961 


gi3462887 


Rattus 
norvegicus 


alpha-fodrin 


68 


31


1962 


AAU83130 


Homo sapiens 


ZYMO Novel secreted protein 


1076 


100 
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SEQ 
ID 
NO: 


Accession 
No.


Species 


Description


Score 


% 
Identity 








Z835892G6P. 






1962 


gil 890354 


Brassica napus 


L-ascorbate peroxidase


80 

ou 


+jj 


1962 


gi7529611 


Leishmania 
major 


nypootneucai protein l, i o / ,uo 


7Q 


^ i 


1963 


AAG78679 


Homo sapiens 


D\JDct- Jtiuman mromDouc pruicixi to. 






1963 


AAY87347 


Homo sapiens 


INCY- Human signal peptide 
containing protein HSPP-124 SEQ ID
NO:124. 


467 


86 


1963 


AAB01431 


Homo sapiens 


MILL- Human i ainou zz*t jiorm /.). 


H\) 1 


R6 


1964 


gi3413504 


Rattus 
norvegicus 


Bassoon 


81 


26 


1964 


gi330452 


human 
herpesvirus 5 


DNA polymerase 


79 


28 


1964 


AAV69717_ 
aal 


Homo sapiens 


LUDW- Tumour rejection antigen 
precursor MAGE-C1 cDNA. 


73 


33 


1965 


gi|2323287|g

b|AAB6652 

8.1| 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


zoo 




1965 


gi|2351212|d 
bj|BAA2206 
4.1|


Friend murine 
leukemia virus 


gag-pol polyprotein (precursor protein) 


179 


47 


1965 


gi|9629516|r 
ef|NP_0447
38.1| 


Rauscher murine 
leukemia virus 


Pol 


1 *70 

i /y 


AO 


1966 


gi|2323287|g 

b|AAB6652

8.1| 


multiple 
sclerosis 
associated 
retrovirus 


polyprotein 


Ana 
4/0 


03 


1966 


gi|2281588|g 

b|AAB6416 

0.1| 


synthetic 
construct 


Pol 


515 


CI 

31 


1966 


gi|9626961|r 
ef|NP_0579
33.1| 


Murine leukemia 
virus 


Pr180


ooi 
52.5 


C 1 

J 1 


1967 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


015 


ID 


1967 


AAM65715 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26021. 


464 


69 


1967 


AAM53338 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 25443. 


464 


69 


1968 


AAG78149 


Homo sapiens 


BODE- Human polypeptide- 
cytochrome b5-13.


388 


82 


1968 


gi3150438


Human 
endogenous 
retrovirus K 


pol-env 


345 


55 




m\ A6Q0A0 

giiwyz** D 


Human 

endogenous 
retrovirus K 


pol/env


345 


55 


1969 


gi21113108 


Xanthomonas 
campestris pv.
campestris str. 
ATCC 33913 


TonB-dependent receptor 


78 


31 



WO 03/080795 



PCT/US02/25485 



207 

Table 2 



SEQ 
ID 
NO: 


Accession 
No. 


Species 
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1969 


gi476274 


Homo sapiens 


R kappa B 


77 


23 


1969 


gi4206769 


Acanthamoeba 
castellanii 


myosin I heavy chain kinase 


/ o 


77 

1 


1970 


gi|13310191| 
gb|AAK181 
89.1|AF331 
500_1 


multiple 

sclerosis 

associated 

retrovirus 

element 


recombinant envelope protein 


244 


77 


1970 


gi|8272468|g 
b|AAF74215 
.1|AF15696 
3 1 


Homo sapiens 


envelope protein 




01 


1970 


gi|21 103962| 
gb|AAM331 
41.1| 


Homo sapiens 


enverin-2


01 Q 


77 


1971 


AAU83621 


Homo sapiens 


GETH Human PRO protein, Seq ID No
60. 


390 


L\J\J 


1971 


AAO05826 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 19718. 






1971 


AAM39560 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2705. 


194 


56 


1972 


gi6456112


Mus musculus 


F-box protein FBX15


19R 


44 


1972 


gi21428946 


Drosophila 
melanogaster 


GH22104p 


74 


31 


1972 


gi|6456112|g 
b|AAF09139 
.1|


Mus musculus 


F-box protein FBX15


128 


44 


1973 


gil48270 


Escherichia coli 


lambda-integrase 






1973 


gil790244 


Escherichia coli 
K12 


site-specific recombinase, acts on cer 
sequence of ColE1, effects
chromosome segregation at cell 
division 


550 


94 


1973 


gil3364217 


Escherichia coli 

O157:H7


site-specific recombinase XerC 




Q9 


1974 


gil805552 


Escherichia coli 


FORMATE HYDROGENLYASE 

TRANSCRIPTIONAL ACTIVATOR


887 


88 


1974 


gil616960 


Escherichia coli 


HyfR 


887 


88 


1974 


gi7920396 


Salmonella 
typhimurium 


formate hydrogenlyase activator 
protein 


522 


54 


1975 


gi409795 


Escherichia coli 


No definition line found


1 17^ 


00 


1975 


gil5074592 


Sinorhizobium 
meliloti 


HYPOTHETICAL 
TRANSMEMBRANE PROTEIN


378 


33 


1975 


gil7740718 


Agrobacterium 
tumefaciens str. 
C58 (U. 
Washington) 


Na+/Pi-cotransporter 


372 


34 


1976 


AAB82047 


Homo sapiens 


IGAK- Human mast cell surface 
antigen. 


163 


23 


1976 


gil2654783 


Homo sapiens 


Similar to loss of heterozygosity, 11, 
chromosomal region 2, gene A 


163 


23 


1976 


AAZ45690_ 
aal 


Homo sapiens 


REGC cDNA sequence encoding the 
human minor vault protein p193.


108 


25 


1977 


ABB56523 


Homo sapiens 


MERI Human NMDA receptor subunit
SEQ ID NO 44.


73 


28 
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1977 


AAW87504 


Homo sapiens 


SEBI- Human N-methyl-D-aspartate 
receptor subunit encoded by clone
NMDA24. 


73 


28 


1978 


AAG00471


Homo sapiens 


GEST Human secreted protein, SEQ ID

NO: 4552. 


285 


93 


1978 


gi298489 


Papio hamadryas 


SP-10 


133 


34 


1978 


gi452582 


Vulpes vulpes 


fox sperm acrosomal protein FSA-
Acr.l 


132 


34 


1979 


AAB87128 


Homo sapiens 


MILL- Human secreted protein 

MAJNOU J4^, orVVc JLLJ lH\J.Lj\J. 


490 


86 


1979 


AAB87179 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349 12 IK variant, SEQ ID 


488 


85 


1979 


AAB87181 


Homo sapiens 


MILL- Human secreted protein 
MANGO 349 E41D variant, SEQ ID 


487 


85 


1982 


AAM75035 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ 


109 


67 


1982 


AAM62231 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID


109 


67 


1982 


gil 1967423 


Mus musculus 


vomeronasal receptor V1RC5 


105 


76 


1983 


AAG89276 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 396. 


224 


46 


1983 


AAB56565 


Homo sapiens 


ROSE/ Human prostate cancer antigen
protein sequence SEQ ID NO: 1143.




40 

"TV 


1983 


AAY44987 


Homo sapiens 


INCY- Human epidermal protein-4. 


78 


28 


1984 


AAB95089 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 17025. 


AOS 


07 


1984 


AAM06608 


Homo sapiens 


HYSE- Human foetal protein, SEQ ID 


495 


96 


1984 


gi497890 


unidentified 

nitrogen-fixing 

bacteria 


alpha subunit of dinitrogenase 
reductase (Fe protein)


73 


24 


1985 


gl|l /4Dj/zo| 

94.11 


Homo sapiens


o-J»Yii1eiT +rt 7 , inp_'fiTiorpr TvrrttMTl lini-n4 

rRpniiiPTin^ f AnrvntoQi*; re*monse zinc 

finger protein) 


71 


37 


1986 


gi21428886 


Drosophila 
melanogaster


GH12469p 


69 


34 


1987 


gi7767529 


Bos taurus 


cyclophilin I 


364 


75 


1987 


gi8699209 


Canis familiaris


cyclophilin A


361 


88 


1987 


gil 1641 132 


Sus scrofa 


cyclophilin 


361 


88 


1988 


gil5073168 


Sinorhizobium 
meliloti 


PROBABLE TRANSLATION 

INITIATION FACTOR IF-2
PROTEIN 


81 


37 


1988 


gill81352 


Paramecium 

bursaria

Chlorella virus 1 


Pro-rich protein; PIPG (8X) 


78 


25 


1988 


gi493242 


Feline 

herpesvirus 1 


Feline herpesvirus type 1 immediate 
early protein 


77 


20 


1989 


AAM65707 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 26013. 


134 


66 
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1989 


AAM53330 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 


134 


66 


1989 


gi|20475216| 

ref|XP_1148

no 11 
UZ.1| 


Homo sapiens 


similar to synapsin I 


228 


59 


1990 


AAM71181 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 

ID NO: 31487.


110 


64 


1990 


AAM58674 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID 
NO: 30779.


110 


64 




glZijZ.30.30 


Corynebacterium
glutamicum 
ATCC 13032 


Sulfate permease and related

transporters (MFS superfamily) 


75 


26 


1991


glljOZOU 


Xenopus laevis


dsRNA adenosine deaminase


96 


34 


1991 


AAE10203 


Homo sapiens 


HYSE- Human bone marrow derived 
contig protein, SEQ ID NO: 68. 


83 


25 


1991


rrl^O/IOAAO 

guz^zwy 


Rana catesbeiana


alpha 1 type I collagen


80 


30 


1992 


gill81423 


Paramecium 
bursaria 

Chlorella virus 1


PBCV-1 chitinase


71 


41 


1992 


gi|21300897| 
42.1| 


Anopheles 
gambiae str.
PEST 


agCP14405 


72 


37 


1992


ml0£11 BOBlr 

gl|yuj loZo|r 

ef|NP 0486 
13.1| 


Paramecium
bursaria 
Chlorella virus 1 


PBCV-1 chitinase


71 


41 


1994 


gl0248 IJJ 


Plasmodium 
falciparum 3D7 


protein phosphatase


72 


25 


1994 


gi4104348 


Campylobacter 
rectus 


S-layer-RTX protein 


70 


38 


1994 


gi|8248755|e 
mb|CAB628 

/o.Z| 


Plasmodium 
falciparum 3D7 


protein phosphatase 


72 


25 


1995 


gi21324402 


Corynebacterium 
glutamicum 


Uncharacterized ATPase related to the 
helicase subunit of the Holliday 

junction resolvase


73 


38 


1995


^MQCCORA^I 
glliyDDZoH j| 

ref|NP_6008 


Corynebacterium
glutamicum 


PHfrO^Sfi'TInrharacterized ATPase 
related to the helicase subunit of the 
Holliday junction resolvase


73 


38 


1995


frill 7^191 31 
gl|l / JJJ£L J\ 

ref|NP_4957
77.1| 


Caenorhabditis

elegans 


F14E5 5 n 


73 


30 


1996


gllo / 1ZZJ 


Rickettsia typhi


crystalline surface layer protein


92 


30 


1996 


gi6969926 


Rickettsia 
aeschlimannii 


OmpB 


79 


25 


1996 


gil4670347 


Rickettsia felis 


OmpB 


78 


25 


1997 


gi|20548733| 
ref|XP_0556
41.2| 


Homo sapiens 


similar to gag protein 


256 


58 


1997 


gi|9739120|g 
b|AAF97916 

•1| 


Bovine leukemia 
virus 


gag 


186 


34 
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1997 


gi|9626226|r 

ef|NP_0568 

97.1| 


Bovine leukemia 
virus 


Pr44 


IRS 

1 O J 


34 


1998 


AAM79834 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3480. 


279 


71 


1998 


AAM78850 


Homo sapiens 


HYSE- Human protein SEQ ID NO
1512. 


070 


71 


1998 


AAM79204 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1866. 


272 


71 


1999 


AAM73176 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 33482. 


168 


48 


1999 


AAM60521 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 32626. 


168 


48 


1999 


gi|13929148| 
ref|NP_1139
97.1| 


Rattus 
norvegicus 


cyclic nucleotide-gated channel beta 
subunit 1 


163 


47 


2000 


gil869859 


human 
herpesvirus 2 


very large tegument protein 


73 


30 


2000 


gi7380253 


Neisseria 

meningitidis 

Z2491 


2-keto-4-hydroxyglutarate aldolase 




37 


2000 


gi7226633 


Neisseria 

meningitidis 

MC58 


4-hydroxy-2-oxoglutarate aldolase/2- 
dehydro-3-deoxyphosphogluconate
aldolase 


70 


37 


2001 


gi17016969


Mus musculus 


NUANCE 


138 


36 


2001 


gi6273778 


Homo sapiens 


trabeculin-alpha 


137 


33 


2001 


gil675222 


Mus musculus 


ACF7 neural isoform 1 




42 


2002 


AAM39256 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2401. 


81 


29 


2002 


8*840789 


Homo sapiens 


binding regulatory factor 


01 


29 


2002 


gil7028337 


Homo sapiens 


regulatory factor X, 5 (influences HLA 
class II expression) 


81 


29 


2003 


gi2252814 


Mus musculus 


FOG 


172 


64 


2003 


AAR58815 


Homo sapiens 


USSH Human c-myc far upstream
element (FUSE) binding protein
(FBP) variant from HL60 clone 3-1.




42 


2003 


gi3598974 


Rattus 
norvegicus 


protein tyrosine phosphatase TD14 


103 


26 


2004 


gill 994696 


Arabidopsis 
thaliana 


contains similarity to DNA repair 
protein-gene ia:K/Mz.ii 


77 


28 


2004 


gi7209527 


Mus musculus 


testis-specific _gene 


73 


24 


2004 


gi|17451912| 
ref|XP_0710
83.1| 


Homo sapiens 


similar to DNA-binding protein JtJ 




97 


2005 


AAE12023


Homo sapiens 


INCY- Human G-protein coupled
receptor, GCREC-2. 


171 


100 


2005 


AAG65832 


Homo sapiens 


FARB Human G protein-coupled 
receptor (GPCR). 


173 


100 


2005 


AAG68126 


Homo sapiens 


FARB Human 7TM-GPCR protein
sequence SEQ ID NO:6.


105 


78 


2006 


gi20068811 


Homo sapiens 


Rab-coupling protein 


130 


43 


2006 


gil5822596 


Homo sapiens 


nRip11


104 


45 
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Identity 


2006 


gil3377897 


Homo sapiens 


Rab11 interacting protein Rip11a


83 


40 


2007 


gi|17539708| 
ref|NP_5014 

OA 1 1 

oy.l\ 


Caenorhabditis 
elegans 


T7ACD/1 K -n 


7R 
/o 


49


2008 


AAE10350


Homo sapiens 


PFIZ Human ADAMTS-J1.4 variant
protein. 


504 


97 


2008 


AAE10349 


Homo sapiens 


PFIZ Human ADAMTS-J1.3 variant 
protein. 


504 


97 


2008 


AAE10347 


Homo sapiens 


PFIZ Human ADAMTS-J1.1 variant
protein. 


504 


97 


2009 


AAV31720_ 
aal 


Homo sapiens 


MOUN Nucleotide sequence of the 
PUR-alpha gene.


87 


29 


2009 


AAT99264_ 
aal 


Homo sapiens 


MOUN Human PUR-alpha gene. 


87 


29 


2009


aal 


Homo sapiens 


MOUN Encodes single-stranded DNA
binding (PUR) protein. 


R7 
o / 




2010 


gil70444 


Lycopersicon 
esculentum


extensin (class II) 


123 


27 


2010 


gi4662641 


Arabidopsis 
thaliana 


expressed protein 


116 


30


2010 


gil88864 


Homo sapiens 


mucin 


113 


zo 


2011


AAY93650 


Homo sapiens 


HUMA- Amino acid sequence of a 
human prostacyclin-stimulating factor- 
z. 


1677 


100 


2011


A A C 1 CTO^l 

AAM5723_ 
aal 


Homo sapiens 


CURA- DNA encoding insulin-like 
growth factor family related protein, 

\S\J V J. 


10 / j 


QO 


2011 


AAE17599 


Homo sapiens 


INCY- Human extracellular messenger 
{/uvldoj-i protem. 


1673 


99 


2012 


gil0440434 


Homo sapiens 


FLJ00052 protein


336 


69 


2012 


gi20502870 


Mus musculus 


SDS3 


111 


Oo 


2012 


gi21430678 


Drosophila 
melanogaster 


RE74901p 


170 


36 


2013 


AAH77293_ 
aal 


Homo sapiens 


T TT '• U 1 1 

MILL- Human ion channel protem 
IC32391 cDNA coding region. 


91 A 


07 




A ATJ1 lO^Q 


Homo sapiens


INCY- Human transporters and ion
channels (TRICH)-5.


91d 
Z l*t 


Q! 


Z\JLD 




Homo sapiens


MILL- Human ion channel protein
IC32391. 


914 

Alt 




2014 


gi4894768 


Xenopus laevis 


ephrin-B2 precursor 


78 


30 


2015 


AAU / /4ys 


Homo sapiens 


INCY- Human lipid metabolism
enzyme, LMM-6. 


1901 

1Z7 1 


inn 


2015 


ABB08205 


Homo sapiens 


INCY- Human lipid metabolism 
enzymeo (LMlio). 


1122 


100 


2015 


ABB07493 


Homo sapiens 


INCY- Human lipid metabolism 
molecule (LMM) polypeptide (ID: 
2965233CD1). 


864 


75 


2016


trill 47^001 ^1 
gl|l*f/0>Ui J| 

ref|XP_0415
69.1| 


Homo sapiens


fibrillin! 


68 


36 


2017 


gi2313786 


Helicobacter 
pylori 26695 


chorismate synthase (aroC) 


78 


33 


2017 


gi4155160 


Helicobacter 
pylori J99 


CHORISMATE SYNTHASE 


72 


32 
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2017 


gi|15645287| 
ref|NP_2074
57.1|


Helicobacter 
pylori 26695


chorismate synthase (aroC) 


78 


33 


2018 


gil5485622 


Homo sapiens 


09H4T4 like 


1068 


100 


2018 


ABB14744


Homo sapiens 


XTT TA/T A TJn-mcm riprvmic QVQtRTTI related 

polypeptide SEQ ID NO 3401.


694 


98 


2018 


AAB95100 


Homo sapiens 


HELI- Human protein sequence SEQ 

JLU INU. 1 /UO*h 


101 


24 


2019 


gi8050556 


Gorilla gorilla 


carboxyl-ester lipase


223 


42 


2019 


AAU09894 


Homo sapiens 


MONS Bile Salt Stimulated Lipase 
(BSSL).


217 


39 


2019 


ABB04676 


Homo sapiens 


MONS Human milk bile salt-
stimulated lipase (BSSL) protein SEQ
ID NO:2.


917 
£.11 


39 


2020 


gi2065210 


Mus musculus 


Pro-Pol-dUTPase polyprotein 


515 


74 


2020 


gi|385615|gb 
|AAB26708. 


Mus sp. 


fibulin gene homolog 




7<? 


2020 


gi|13194728| 
gb|AAK155 
26.1|AF329 
451 1 


Gallus gallus 


pol-like protein rsiNb-J 


170 


j j 


2021 


AAM66980 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 

ID NO: 27286.


170 


75 


2021 


AAM54574 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26679. 


170 


75 


2021 


AAM75189 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 35495.


1 50 

1 


00 


2022 


AAD29146_ 
aal 


Homo sapiens 


ZYMO Human Zcyto21 consensus 

cDNA.


649 


83 


2022 


AAU83208 


Homo sapiens 


ZYMO Novel secreted protein
Z908463G2P. 


AAQ 


83 


2022 


AAE18311 


Homo sapiens 


ZYMO Human Zcyto21 consensus
protein. 




83 


2024 


gil4336750 


Homo sapiens 


Ce protein similar to Dm Cys3His 
finger protein


84 


34 


2024 


AAB50363 


Homo sapiens 


UYSL- Human SRCAP. 


83 


34 


2024 


AAB93541 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:18149.


83 


34 


2025 


giloo/oooz 


Homo sapiens 


i*j-JUU^*tu pruicui 


470 


45 


2025 


gil4701866 


Dictyostelium 
discoideum 


carmil 


221 


29 


2025 


gil881738 


Acanthamoeba 
castellanii 


myosin-I binding protein Acan125


219 


29 




ABB12490


Homo sapiens


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


212 


78 


2027 


AAU83147 


Homo sapiens 


ZYMO Novel secreted protein 
Z846363G2P. 


1153 


100 


2027 


gi|21287755| 

gb|EAA000
76.1|


Anopheles 
gambiae str. 
PEST 


ebiP4780 


205 


51 
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2027 


gi|17552028| 
ref|NP_4984
0/.l| 


Caenorhabditis
elegans 




91 


38 


2028 


gil510143 


Homo sapiens 


similar to C.elegans protein encoded in 
cosmid T20D3 (Z68220, 


323 


57 


2028 


gi3879942 


Caenorhabditis 


T20D3.11 


124 


27 


2028 


gi5869818 


Globodera 
pallida 


NADH-ubiquinone oxidoreductase 
subunit 6 


82 


27 


2029 


AAE13250 


Homo sapiens 


INCY- Human transporters and ion

channels (TRICH)-15. 


75 


31 


2029 


gi3252893 


Thermotoga
neapolitana 


ABC transporter


74 


37 


2029 


gi|l 8403965 1 
ret|JNr jODo 

26.1| 


Arabidopsis 

thaliana


expressed protein


70 


29 


2030 


AAB97908 


Homo sapiens 


SHAN- Human GTP-binding protein 
17 SEQ ID NO:2


79 


27 


2030 


AAM42129 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 7060. 


79 


27 


2030 


gi9971156


Mus musculus 




79 


27 


2031 


gi|20864803| 
ref|XP_1308
00.1|


Mus musculus 


RIKEN cDNA 4930503K02 


89 


25 


2031 


gi|21262152| 
emb|CAD32 
690.1|


Oryza sativa 


SMC4 protein 


77 


28 


2031 


gi|1507705|g 

b|AAB0656

8.1| 


Borrelia 
burgdorferi 


outer surface protein


74 


33 


2032 


AAG65898 


Homo sapiens 


SMIK Amino acid sequence of GSK 
gene Id 18525. 


481 


100 


2032 


A ATTO^iC7A 

AAUoJo/U 


Homo sapiens


GETH Human PRO protein, Seq ID No

158. 


471 


97 


2032 


A T)T> Oyl OO/C 

ABi3o459o 


Homo sapiens


OFTTT Human PR 01^09 nrotein 

VJXj X XX XXUXXuUX X J\v X — » ^/x V/ iwiu 

sequence SEQ ID NO: 160. 


471 


97 


2034 


gi6723273


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein


687 


43 


2034 




Moloney murine
leukemia virus


Pr180 gag-pro-pol polyprotein


685 


42 


2034 


gi2801471 


Moloney murine


Pr180


682 


42 


2035 


gi|17554696| 
ref|NP_4976
70.1| 


Caenorhabditis 
elegans 


R148.7.p 


68 


32 


2035 


gi|16127996|
ref|NP_4145
43.1|


Escherichia coli 
K12 


aspartokinase I, homoserine
dehydrogenase I 


68 


43 


2035 


gi|19548975|
gb|AAL908
85.1|AF487
900_1


Escherichia coli 


aspartokinase I-homoserine- 
dehydrogenase I 


68 


43 


2036 


gil3424459 


Caulobacter 


methyl-accepting chemotaxis protein 


72 


32 



WO 03/080795 PCT/US02/25485 



214 
Table 2 



SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Score 


% 
Identity 






crescentus CB15 


Mcpl 






2036 


gi|16877133|
gb|AAH168
38.1|AAH16
838


Homo sapiens 


carboxypeptidase, vitellogenic-like


69 


30 


2037 


AAB67055 


Homo sapiens 


INCY- Human immune response 
molecule ^uviurNj proiem oiiv^ u-' r*vy. 
9. 


532 


75 


2037 


AAO01862 


Homo sapiens 


HYSE- Human polypeptide SEQ ID


403 


67 


2037 


gi|6753924|r 
ef|NP_0343
74.1|


Mus musculus 


Friend virus susceptibility 1


240 


39 


2039 


AAB38447 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 20 clone 
rxxjrD 1 1 j. 


80 


27 


2039 


gi11527799


Mus musculus 


GTP-binding protein like 1 


73 


30 


2039 


gi695237 


Equine 
herpesvirus 2 


tegument protein 


73 


33 



2040 


gi|20544038| 
ref|XP_0896
12.4|


Homo sapiens 


similar to PER-HEXAMER REPEAT 
PROTEIN 5 


68 


41 


2042 


AAM77922 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 

xU JNU. OOZZo. 


642 


85 


2042 


AAM65219 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID

XTfV ITXOA 

NU: 3/jZ4. 


642 


85 


2042 


gi|6723273|d 
bj|BAA8965
9.1| 


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


139 


26 


2043 


gi48507 


Wolinella 
succinogenes 


formate dehydrogenase


80 


27


2043 


gu238l857 


Danio rerio 


c-Maf 


78 


42 


2043 


gi|18594822| 
reijXP 092*/ 
95.11 


Homo sapiens 


zinc finger protein 21 (KOX 14) 


306 


100 


2044 


gliliiZ/Z 


Sus scrofa 


AXTTI VinmnlncniP 


99 


47 


2044 


AAG78446 


Homo sapiens 


MASI Predicted WT1 Wilm's tumour 
polypeptide of humans. 


96 


45 


2044 


AAG62154 


Homo sapiens 


CORI- Human WT1/PSA fusion 
proiem oj&v • UL ' ^ j 1 • 


96 


45 


2046 


gi21483222


Drosophila
melanogaster


AT1 AQQArt 


86 


33 


2046 


gi21111736


Xanthomonas 
campestris pv. 
campestris str. 
ATCC 33913 


cell division protein


79 


30 


2046 


gi12653493


Homo sapiens 


Similar to brain acid-soluble protein 1 


79 


36 


2047 


ABB12490 


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 329. 


200 


83 


2047 


gi|20837783| 
repP 1459 
21.11 


Mus musculus 


similar to 40S ribosomal protein S11


73 


35


2047


rrIl£AfV)O10lrr 

gl|oUU2ioz|g 
.1|AF16496 


Streptomyces 

fradiae


glycosyl transferase 


T1 

/ 1 


^s 

• 


2048




Homo sapiens


HUMA- Breast and ovarian cancer

associated antigen protein sequence 
SEQ ID 720. 






2048 


gi2429362 


Santalum album 


proline rich protein 


99 


31 


2048


«~J1 TO/1 

gll /y453oz 


Drosophila 
melanogaster 


KJSl / 103p 




9S 
Zj 


2051


gll 5 625 542 


Hepatitis B virus 


S antigen 


T1 
/l 


j1 


2051 


•I/ion ylQQ 

b|AAD3185 

T 11 A T71 1/11 

/.l|Arl341 

40 1 


Hepatitis B virus 


surface antigen 


Oo 


7A 


TA<9 


AAUZo /04 


Homo sapiens 


xiuiVLA.- ocquciicc nomoiogous 10 
protein nagmeni cncoaca uy gene z i . 


020 


78 
to 




glZUO^Z 1U 


Mus musculus


IT Itr-X^01~LIU li abC pUiypiUlCUUl 




78 


2052 


AAB73606 


Homo sapiens 


SHAN- Human dUTP pyrophosphatase 


668 


77 


2053 


gi9945983 


Pseudomonas
aeruginosa


transcriptional regulator PcaQ 


83 


34 


2053 


gi13874427


Homo sapiens 


cerebral protein-5 


76 


35 


2053 


gi12803205


Homo sapiens 


CAAX box 1


76 


35 


2054


glZ13U/o31 


Aplysia 
californica


CREB-binding protein


/O 


zo 


2054


gllD/DDoo / 


Drosophila 
melanogaster 


guanine nucleotide exchange factor 


t£ i 
/o 


TA 
ZO 


2054


m 10 HATCH 1 
gl|Z13U/o31| 

ctKIA AT SAft 

59.11 


Aplysia 
californica


CREB-binding protein


TA i 


ZO 


2055 


gi16588389


Homo sapiens 


B lymphocyte activation-related protein 

pf 1 CI A 


437 


71 


2055 


AAB92981 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 11698. 


407 


68 


2055 


AAM48325 


Homo sapiens 


SHAN- Human purine receptor 21.23. 


398 


74 


2056 


gi|2072969|g 

b|AAC5127 

4.1| 


Homo sapiens 


p40 


1 1A 

134 


/IT 

4/ 


2056


gi|7959889|g
b|AAF71115 

1 1AT71 I^TT 
. l|Ar 1 10/Z 

1 OS 


Homo sapiens 


PRO2221


1 TO 

123 


/ti 
43 


2056 


gi|2072974|g 

VilA APS19T 
U|/\AV^J 1Z / 

7.11 


Homo sapiens 


p40 


122 


44 


2057 


gi19171178


Homo sapiens 


metalloprotease disintegrin 16 with 
thrombospondin type I motif


518 


98 


2057 


gi19171150


Homo sapiens 


ADAMTS18 protein


168 


35 


2057 


AAM39212 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2357. 


128 


76 


2058 


gi|4959869|g 

b|AAD3453 

6.1| 


Murine leukemia 
virus 


polymerase 


336 


50 



2058 


gi|9630313|r 
ef|NP_0567 
on ii 


Gibbon ape 
leukemia virus 


pol polyprotein 


331 


46 


2058 


gi|6723273|d 
bj|BAA8965 
9.1|


Baboon 
endogenous 
virus strain M7 


gag-pol precursor polyprotein 


329 


49 


2059 


gl[ZU04o4U4| 

refpCP 1164 
66.1| 


Homo sapiens


similar to nuclear receptor coactivator
4; RET-activating gene ELE1 


179 


91 


2060 


gi|6731237|g 
b|AAF27177 
.1|AF18231
7_1


Homo sapiens 


myoferlin


112 


79 


2060


rril7Q°/7QQlcrh 

gi|/yo/yy[go 
IAAOV7711 

1| 


Mus musculus


immunoglobulin heavy chain


72 


55 


2060 


gi|20819487| 
ref|XP_1453
57.1|


Mus musculus 


similar to LYRIC 


72 


27 


2061


gi415738


Euglena gracilis


PSII D1-polypeptide


75 


27 




cnl 1491 


Euglena gracilis


32 kd protein 


75 


27 


2061 


gi11488


Euglena gracilis 


32-Kda thylakoid membrane protein 


75 


27 


2062




Arabidopsis
thaliana


AT3g01480/F4P13_3


79 


29 


2062


gl^*>.J /JUU 


Arabidopsis
thaliana


nodulin-like protein 


68 


36 




cri7QSQ77ft 




PRO1546


121 


42 


2063 


AAG02639 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6720


119 


53 


2063 


AAG02753 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6834


110 


45 


2064


gll JU / / WO 


Antheraea
yamamai


fibroin


109 


30 


2064 


AAB82806 


Homo sapiens 


BOST- Human low density lipoprotein 
binding protein 2 (LBP-2). 


92 


24 


2064 


AAO01059 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 14951. 


90 


30 


2065 


gi200964 


Mus musculus 


serine 2 ultra high sulfur protein 


80 


30 




gi200962


Mus musculus


serine 1 ultra high sulfur protein 


80 


30 






Homo sapiens


HUMA- Human polypeptide SEQ ID 
NO 34. 


75 


28 


2066 


gi544724 


Cavia 


cholecystokinin A receptor; CCK-A 
receptor 


69 


29 


2066 


gi2541920 


Rattus
norvegicus


cholecystokinin type-A receptor 


69 


29 


2066 


gi2114152 


Mus musculus 


cholecystokinin type-A receptor 


69 


29 


2067






BRCA1 


73 


22 


2068 


AAM40813 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5744. 


75 


29 


2068 


AAM39027 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2172. 


75 


29 


2068 


AAY25768 


Homo sapiens 


HUMA- Human secreted protein 
encoded from gene 58. 


75 


29 


2070


gi1334150


Mus musculus


unidentified reading frame (first ATG
at pos. 210)


169


28






2070 


gi557822 


Saccharomyces 
cerevisiae


mal5, stal, len: 1367, CAI: 0.3, 
AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


133 


20 


2070 


1*2 A/| OCT 

gll3U43o/ 


Saccharomyces
cerevisiae var.
diastaticus


glucoamylase


133 


20 


2071 


gi17983056


Brucella 
melitensis 


BETA-HEXOSAMINIDASE A 


88 


29 


2071 


gll57391 / 


Haemophilus 
influenzae Rd


mUlUUTUg ICblMailCC piULCJJU x\ ypiiiix^j 


81 


33 


2071 




Brucella


NITROGEN REGULATION
PROTEIN NTRB


80 


26 


2073


gi|17532255| 
reijrNx^ 'ti/of 
31.1| 


Caenorhabditis 

elegans


ankyrin and proline rich domains 


67 


29 


2074


cri 1001 07^0 


Homo sapiens


BTEB5 


704 


97 


2074 


gi13195441


Homo sapiens 


BTE-binding protein 4 


478 


64 






Mus musculus


dopamine receptor regulating factor


452 


76 


2076 


AAE17482 


Homo sapiens 


ZYMO Human leucine-rich repeat-7 
(ZLRR7) protein. 


1326 


100 






Homo sapiens


ZYMO Novel secreted protein

Z887300G2P. 


1326 


100 


2076 


Ann 1 1 OylO 

Add 1124/ 


Homo sapiens


WVQF- TTumnTi ^IT TT-9 TiAmAlAOiie 
SFOTDNO'1612 


568 


99 


2077 


gil8893729 


Pyrococcus
furiosus DSM
3638


protease iv 


74 


34 


2077 


AAB94745 


Homo sapiens 


HELI- Human protein sequence SEQ 
rn no-1 S70? 

XL/ iNv/. I J / y*** 


71 


34 


2077 


gi16413096


Listeria innocua 


lin0656


68 


35 


2078 


gi60675 


Beet ringspot 
virus 


polyprotein 


75 


37 


2078 


gi|14743288| 

-aflVT) A/171 

ret|Xr w / 1 
91.11 


Homo sapiens 


similar to Alu subfamily J sequence
contamination warning entry


92 


58 


2078


m^OAO^AQAI 1 

gl[ZUZOUoUi| 

rpfTKTP A9fi1 
rcI|lNJr OZU 1 

13.1| 


Beet ringspot
virus 


polyprotein


75 


37 


2079 


gi3834629 


Mus musculus 


diaphanous-related formin; p134
mDia2 


208 


67 


2079 


AAG74400 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:5164.


71 


36 


2079 


gi3171906 


Homo sapiens 


DIA-156 protein 


71 


36 


2080


cri1770R^1 ^ 
gll /Z7OJ 1 J 


Homo sapiens


candidate tumor suppressor protein


125 


100 


2080 


gi7861733 


Homo sapiens 


low density lipoprotein receptor related 
protein-deleted in tumor


125 


100 


2080 


gi8926243 


Mus musculus 


low density lipoprotein receptor related 
protein LRP1B/LRP-DIT


90 


63 


2081 


gi4574224 


Fundulus 
heteroclitus 


multidrug resistance transporter 
homolog 


343 


55 


2081 


gi16304396


Pseudopleuronec 
tes americanus 


multidrug resistance transporter-like 
protein 


340 


52 


2081 


gi3355757


Gallus gallus


ABC transporter protein 


328 


53


2082 


gi7532975 


bacteriophage 
phi-8 


P10 


67 


27


Table 3



SEQ ID
NO: 


Database 
entry ID 


Description 


*Results


1059 


BL00349 


CTF/NF-I proteins. 


BL00349H 15.70 9.710e-09 8-45 


1061 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.143e-10 29-61 
DM00215 19.43 8.322e-09 40-72 




DM01354


kw TRANSCRIPTASE REVERSE II 
ORF2. 


DM01354U 12.24 6.092e-12 80-99 




PR00944 


COPPER ION BINDING PROTEIN 
SIGNATURE 


PR00944E 9.18 7.132e-09 33-46 


1076 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 9.217e-09 23-35 


1089 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308C 3.83 8.754e-10 16-25


1080 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 9.658e-09 16-30 


1080


PR00341


PRION PROTEIN SIGNATURE 


PR00341E 3.32 9.898e-09 24-43 


1099 


PR00886 


HIGH MOBILITY GROUP
(HMG1/HMG2) PROTEIN 
SIGNATURE 


PR00886C 11.84 1.141e-12 28-46 


1107 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 3.077e-09 51-65 


1118 


BL00472 


Small cytokines 
(intercrine/chemokine) C-C
subfamily signatur. 


BL00472A 7.45 5.655e-09 1-12


1118 


PR00655 


AUXIN BINDING PROTEIN 
SIGNATURE 


PR00655E 8.06 9.000e-09 88-103 


1119


BL00970 


Nuclear transition protein 2 proteins. 


BL00970C 14.80 8.183e-12 99-136 


1119 


BL00826 


MARCKS family proteins. 


BL00826B 12.51 4.279e-09 92-143 


1119


BL00348 


p53 tumor antigen proteins.


BL00348F 23.19 5.881e-10 93-135 
BL00348F 23.19 6.857e-09 91-133


1119


PD01457 


RIBOSOMAL PROTEIN 40S ZINC- 
FINGER METAL. 


PD01457A 16.51 8.216e-09 73-117 


1119 


BL00752 


XPA protein. 


BL00752B 19.17 7.866e-09 100-143 
BL00752B 19.17 8.979e-09 63-106 


1119 


DM01269 


303 kw ACTIVATING RAN 
GTPASE ISOZYME. 


DM01269A 23.35 9.446e-09 109-136 


1124 


DM01813 


EGG-LAYING HORMONE. 


DM01813A 15.31 5.215e-09 15-42


1127 


BL00452 


Guanylate cyclases proteins. 


BL00452A 17.52 1.170e-09 6-27 


1131 


BL00113 


Adenylate kinase proteins. 


BL00113B 20.49 9.897e-09 157-200 


1162 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-35 24-62 


1163 


BL00407 


Connexins proteins. 


BL00407B 14.23 9.775e-30 21-51 
BL00407C 14.61 2.500e-24 52-79 


1163 


PR00206 


CONNEXIN SIGNATURE 


PR00206B 13.75 1.957e-24 33-55 
PR00206A 11.35 6.559e-23 2-26 
PR00206C 15.16 7.469e-20 58-78 


1171 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.500e-28 35-73 


1177


DM01803 


1 HERPESVIRUS 
GLYCOPROTEIN H. 


DM01803C 7.00 7.240e-09 46-55


1190 


PR00774 


GUANYLIN PRECURSOR 
SIGNATURE 


PR00774A 6.49 8.579e-10 69-81 


1195 


PD02059 


CORE POLYPROTEIN PROTEIN 
GAG CONTAINS: P. 


PD02059C 21.58 8.031e-09 100-140 


1197 


BL00472 


Small cytokines 
(intercrine/chemokine) C-C 
subfamily signatur. 


BL00472A 7.45 8.000e-14 1-12 


1213 


PR00437 


SMALL CXC CYTOKINE
FAMILY SIGNATURE


PR00437C 14.85 1.310e-16 33-51




BL00471


Small cytokines
(intercrine/chemokine) C-x-C
subfamily signat.


BL00471 23.92 7.960e-10 6-53


1916 


PR00308


TYPE I ANTIFREEZE PROTEIN
SIGNATURE


PR00308C 3.83 5.208e-09 183-192 


1999 


PF00852


Fucosyl transferase.


PF00852F 15.97 1.409e-15 195-231 


1224 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 6.301e-11 47-98




PR00540


MTTQPAPTMTP Ml RFPFPTOR 

SIGNATURE 


PR00540A 10.24 7.174e-09 134-153


1240


BL00290


Immunoglobulins and major
histocompatibility complex proteins.


BL00290A 20.89 7.480e-10 160-182
BL00290B 13.17 2.875e-09 226-243


1258


PR00792


PEPSIN (A1) ASPARTIC
PROTEASE FAMILY SIGNATURE


PR00792A 11.54 5.500e-18 80-100




BL00141


Eukaryotic and viral aspartyl
proteases proteins.


BL00141A 12.10 4.789e-15 87-102
BL00141B 12.14 2.929e-10 228-239


1 TAA 


BL00616


Histidine acid phosphatases
phosphohistidine proteins.


BL00616A 11.86 1.000e-09 136-143


1 iai 


DM01417


< 1™ TMTYT TPTNO YPMP?. 
O K.W 11N xJ \J K^xVs VJ -/Vl IViv^Z. 

MUSHROOM SPAC22G7.04. 


DM01417C 12.93 9.325e-12 361-372
DM01417D 11.08 9.820e-12 400-415


1302 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 6.067e-11 324-338


1311 


BL00926 


Lysyl oxidase copper-binding region 
proteins. 


BL00926B 13.84 7.453e-09 84-121 


1320 


PR00830 


ENDOPEPTIDASE LA (LON) 
SIGNATURE 


PR00830A 8.41 3.712e-09 29-48




BL00048


Protamine P1 proteins.


BL00048 6.39 4.671e-10 58-84
BL00048 6.39 4.908e-10 60-86
BL00048 6.39 2.913e-09 59-85
BL00048 6.39 5.950e-09 57-83


1345


PF00424


REV protein (anti-repression
transactivator protein).


PF00424A 14.34 2.436e-09 184-215


1345 


BL00048


Protamine P1 proteins.


BL00048 6.39 4.553e-10 178-204
BL00048 6.39 6.513e-09 179-205


1 'X'Z'X 

LDJD 


DM01354


kw TRANSCRIPTASE REVERSE II
ORF2.


DM01354U 12.24 2.857e-15 82-101


11A1 
1 jOJ 




Histone deacetylase family.


PF00850B 10.13 5.154e-14 95-109
PF00850C 14.55 9.063e-11 132-148


1 1RQ 


PR00833


POLLEN ALLERGEN POA PI
SIGNATURE 


PR00833H 2.30 6.423e-09 50-64 


1 1RQ 


PD00306


PROTEIN GLYCOPROTEIN
PRECURSOR RE


PD00306B 5.57 7.000e-09 59-69 




BL00427


Disintegrins proteins.


BL00427 13.93 7.698e-17 260-314 


1396 


PR00289 


DISINTEGRIN SIGNATURE 


PR00289A 13.62 5.667e-14 274-293 


141 £ 

1410 


BL00419


Photosystem I psaA and psaB
proteins.


BL00419B 22.23 9.489e-09 18-51 




PF00075


RNase H.


PF00075I 16.21 7.375e-11 167-173


1440 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.500e-15 112-133 


1440


PR00504


CHROMODOMAIN SIGNATURE


PR00504B 9.12 5.200e-13 106-120
PR00504C 11.19 6.510e-09 121-133


1450 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 2.227e-09 93-114 


1451 


PD02935 


FATTY ACID

OXIDOREDUCTASE BIOSYNT. 


PD02935C 16.62 4.375e-16 59-86 


1467 


BL00479 


Phorbol esters / diacylglycerol
binding domain proteins.


BL00479A 19.86 3.000e-11 130-152
BL00479B 12.57 3.340e-10 156-171


1468 


PF00992 


Troponin.


PF00992A 16.67 5.563e-10 139-173 


1468 


BL00795 


Involucrin proteins. 


BL00795C 17.06 3.600e-09 193-237 


1468




FOS TRANSFORMING PROTEIN
SIGNATURE


PR00042D 8.97 7.554e-09 141-162 


1474 


BL00107


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 9.308e-12 62-92 


1474 


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE 


PR00109B 12.27 1.563e-09 62-80 


1474 


BL00239 


Receptor tyrosine kinase class II 

proteins.


BL00239C 18.75 4.205e-09 49-71 


1475 


BL00456 


Sodium:solute symporter family
proteins.


BL00456C 24.55 4.886e-28 15-69 


1480 


BL00983 


Ly-6 / u-PAR domain proteins. 


BL00983C 12.69 1.346e-09 36-51 


1482 


BL00979


G-protein coupled receptors family 3 
proteins. 


BL00979A 19.66 9.633e-12 74-121 




"DTW)^/S1 
JrUUZJOl 


TiT?TTTTO"RTOTTM ^VNTRFTA 

SYNTHASE. 


PD07S61R 12 71 9 308e-09 176-182 


1506


BL00297


Heat shock hsp70 proteins family
proteins.


BL00297H 15.46 9.625e-23 302-355
BL00297D 11.95 6.063e-21 166-205 
BL00297E 18.56 6.077e-21 226-269 
BL00297C 9.51 9.667e-15 105-156 


1506 


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301I 12.76 3.208e-11 320-336


1513 


PR00130


DNASE I SIGNATURE 


PR00130E 14.66 5.046e-09 237-266 


1515 


DM01242


3 THREONINE--TRNA LIGASE. 


DM01242A 20.32 5.286e-20 163-206 


1517 


BL00983 


Ly-6 / u-PAR domain proteins. 


BL00983B 8.19 5.935e-10 40-49 


1520 


BL00415 


Synapsins proteins. 


BL00415P 2.37 3.914e-10 138-173 


1520


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 3.746e-09 124-138
PR00049D 0.00 1.000e-08 123-137


1 


PF00075


RNase H.


PF00075F 12.87 5.500e-10 127-137


1537


PR00463


E-CLASS P450 GROUP I
SIGNATURE


PR00463F 17.63 5.219e-13 288-306
PR00463A 11.40 8.714e-12 52-71
PR00463B 17.50 5.041e-10 76-97


1537 


PR00385 


P450 SUPERFAMILY 
SIGNATURE 


PR00385C 16.94 6.318e-09 289-300 


1538 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 5.585e-09 19-37 


1 SSI 
1 JJ J 


DM01354


kw TRANSCRIPTASE REVERSE II
ORF2.


DM01354Y 10.69 6.423e-16 113-152


1 J JO 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 6.400e-25 70-108 


1564


PF00589


Phage integrase family.


PF00589B 16.17 1.621e-11 158-171
PF00589C 14.62 9.609e-10 183-194


1 JUU 


BL00908


Mandelate racemase / muconate
lactonizing enzyme family signa.


BL00908B 37.71 6.455e-13 191-245 


1567


PR00702


ACRIFLAVIN RESISTANCE
PROTEIN FAMILY SIGNATURE


PR00702A 14.92 2.421e-25 8-32
PR00702B 12.77 9.690e-18 36-54


1570


BL01047


Heavy-metal-associated domain
proteins.


BL01047A 13.50 5.125e-17 75-97


1575 


DM01354 


kw TRANSCRIPTASE REVERSE II
ORF2.


DM01354U 12.24 9.429e-15 80-99 


1606 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00642 11.59 2.575e-11 197-207


1610 


DM01354 


kw TRANSCRIPTASE REVERSE II
ORF2.


DM01354I 15.55 7.702e-34 348-388
DM01354G 11.57 3.625e-32 277-307
DM01354H 18.00 2.528e-23 308-347
DM01354F 14.56 4.088e-11 241-276


1616 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 2.263e-25 32-85


1627 


PR00121 


SODIUM/POTASSIUM- 
TRANSPORTING ATPASE 
SIGNATURE 


PR00121A 6.71 1.000e-08 15-29


1630 


PR00824 


HEPATIC LIPASE SIGNATURE 


PR00824A 7.81 7.214e-22 6-24


1640 


BL00359 


Ribosomal protein L11 proteins.


BL00359C 22.18 1.155e-11 93-126


1641 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 8.839e-10 134-145 


1641


PR00081


DEHYDROGENASE FAMILY
SIGNATURE


PP000R1 A 10 SI 9 000p-1 9 4S-69 
PR 0008 IF 17 54 1 78^e-10 238-255 
PR00081B 10.38 2.227e-09 134-145 


1641 




Short-chain
dehydrogenases/reductases family
proteins.


BL00061A 9.41 9.053e-10 134-144
BL00061B 25.79 6.860e-09 197-234 


1666 


BL01257 


Ribosomal protein L10e proteins.


BL01257D 18.80 2.973e-15 59-98 


1667 


BL01241 


Link domain proteins.


BL01241 35.81 8.579e-37 180-232
BL01241 35.81 7.835e-14 289-341 


1667 


BL00086 


Cytochrome P450 cysteine heme- 
iron ligand proteins. 


BL00086 20.87 3.377e-09 283-314 


1668 


PR00671 


INHIBIN BETA B CHAIN
SIGNATURE 


PR00671A 8.36 8.088e-09 4-22 


1672 


BL00674 


AAA-protein family proteins. 


BL00674E 15.24 5.680e-15 31-50


1682 


PF00075 


RNase H. 


PF00075A 14.44 4.400e-13 73-89
PF00075C 11.58 8.442e-09 152-163


1689 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.471e-27 268-306


1689 


PR00788 


NITROPHORIN SIGNATURE 


PR00788A 9.79 6.108e-09 3-15


1692 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 4.759e-10 32-83 


1697 


PR00423 


CELL DIVISION PROTEIN FTSZ 
SIGNATURE 


PR00423E 7.36 4.038e-09 20-41 


170/; 




Involucrin proteins.


PT 0070^ 17 OA S 1Q<t> lO 1 ftS 790 


1709 


BL00514 


Fibrinogen beta and gamma chains
C-terminal domain proteins.


BL00514C 17.41 3.618e-25 68-104 

T4T OOS 1 AV( \A OS A 74S#» 1 A 9^0 9 S4 
BL00514G 15.98 6.566e-14 108-227
BL00514E 14.28 8.286e-14 128-144 
BL00514D 15.35 2.915e-12 109-121 


1714 


PF00878 


Cation-independent mannose-6-
phosphate receptor repeat proteins. 


PF00878T 17.51 3.818e-09 41-67


1715 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 4.872e-09 123-157 


1715 


PF00992 


Troponin.


PF00992A 16.67 6.451e-10 109-143 
PF00992A 16.67 3.724e-09 98-132 
PF00992A 16.67 6.684e-09 96-130 


1718 


PD02474 


SYNTHASE SMALL SUBUNIT 
ACETOLACT. 


PD02474B 21.08 7.940e-10 92-130 


1725 


BL00412 


Neuromodulin (GAP-43) proteins.


BL00412B 10.60 1.000e-10 46-82 


1725 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.116e-10 54-74


1725 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688G 16.45 3.160e-09 119-150
DM01688I 14.97 6.885e-09 107-154


1725 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 8.564e-09 303-335 


1727 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 7.750e-21 185-215 


1727 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 7.176e-12 185-203


1727


BL00239 


Receptor tyrosine kinase class II
proteins.


BL00239B 25.15 4.387e-09 119-166 


1728 


BL00415 


Synapsins proteins. 


BL00415Q 2.23 8.115e-09 52-87 




PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270B 22.18 5.567e-18 75-111 
PD01270C 19.54 1.167e-17 118-146 
PD01270A 17.22 4.960e-14 21-60 
PD01270D 24.66 4.284e-09 152-187 


1736 


PD02346 


PHOTOSYSTEM II PROTEIN 
PRECURSOR PHOTOSYNTHESIS. 


PD02346A 9.24 8.851e-09 6-17 


1741 


BL00415 


Synapsins proteins. 


BL00415Q 2.23 6.777e-09 317-352 


1744 


BL00479


Phorbol esters / diacylglycerol
binding domain proteins. 


BL00479B 12.57 1.000e-08 33-48 


1750 


PR00763 


COAGULIN SIGNATURE 


PR00763B 8.39 6.457e-09 41-60 


1754 


PR00276 


INSULIN A CHAIN SIGNATURE 


PR00276A 11.84 7.840e-09 46-55 


1 /j«> 




FOS TRANSFORMING PROTEIN
SIGNATURE


PR00042D 8.97 2.565e-09 164-185


1755 


PF00922 


Vesiculovirus phosphoprotein. 


PF00922A 19.17 5.759e-09 99-132 


1778




OLFACTORY RECEPTOR
SIGNATURE


PR00245A 18.03 9.836e-14 59-80 
PR00245C 7.84 1.540e-13 237-252 
PR00245B 10.38 2.125e-13 176-190 


1778 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 1.474e-12 90-129


1778


PR00534 


MELANOCORTIN RECEPTOR 
FAMILY SIGNATURE 


PR00534A 11.49 4.729e-09 51-63 


1778


PR00237


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237A 11.48 3.613e-09 26-50 
PR00237C 15.69 7.525e-09 104-126 


1787


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 5.114e-15 146-165 
PR00007A 19.33 7.052e-10 119-145 


1787 


PR00524 


CHOLECYSTOKININ TYPE A 
RECEPTOR SIGNATURE 


PR00524F 5.36 4.351e-09 70-83 


1787 


DM00250 


kw ANNEXIN ANTIGEN 
PROLINE TUMOR. 


DM00250B 13.84 6.595e-09 82-105 


1787 


BL00415 


Synapsins proteins. 


BL00415N 4.29 7.372e-09 62-105 


1787 


BL01113


Clq domain proteins. 


BL01113B 18.26 3.786e-23 125-160
BL01113A 17.99 7.968e-15 73-99
BL01113A 17.99 5.091e-14 70-96
BL01113A 17.99 5.295e-11 64-90
BL01113A 17.99 8.568e-11 79-105
BL01113A 17.99 8.977e-11 67-93
BL01113A 17.99 4.635e-09 82-108
BL01113A 17.99 6.192e-09 76-102
BL01113A 17.99 7.750e-09 61-87


1787


BL00420


Speract receptor repeat proteins
domain proteins.


BL00420A 20.42 8.691e-11 73-101
BL00420A 20.42 9.673e-11 70-98
BL00420A 20.42 2.180e-10 55-83
BL00420A 20.42 8.062e-09 52-80


1789 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W 


DM01930E 15.41 2.964e-33 45-89 


1795 


DM01688 


2 POLY-IG RECEPTOR 


DM01688I 14.97 7.480e-10 107-154
DM01688J 14.69 4.455e-09 60-96 


1796 


PF00075 


RNase H. 


PF00075J 15.78 4.115e-13 115-132 


1802 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 4.130e-11 86-98


1802 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 1.600e-10 110-126 
BL00028 16.07 6.100e-10 70-86 


1802 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 9.438e-10 83-92


iri?


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR. 


PD00078B 13.14 4.130e-09 157-169 


1824 


PF00628 


PHD-finger.


PF00628 15.84 5.500e-13 78-92 


1833 


PF00075 


RNaseH. 


PF00075B 12.56 4.732e-10 156-166 


1833


PR00939


C2HC-TYPE ZINC-FINGER
SIGNATURE 


PR00939A 8.95 3.045e-09 137-146 




PR00833


POLLEN ALLERGEN POA PI
SIGNATURE


PR00833H 2.30 3.192e-09 244-258






Ubiquitin carboxyl-terminal
hydrolases family 2 proteins.


BL00972D 22.55 3.348e-11 168-192


lOJ / 


PF00424 


REV protein (anti-repression
transactivator protein).


PF00424A 14.34 8.085e-09 71-102 


1860


PR00221 


CAULIMOVIRUS COAT PROTEIN 
SIGNATURE 


PR00221H 12.82 2.410e-09 184-197 


1864


BL01282 


BIR repeat proteins. 


BL01282B 30.49 1.136e-10 214-252 


1866 


BL00155 


Cutinase serine proteins.


BL00155D 26.87 5.337e-09 19-67 


1895 


PF00075 


RNase H. 


PF00075F 12.87 7.353e-10 93-103 


1911


BL00983


Ly-6 / u-PAR domain proteins.


BL00983C 12.69 6.365e-09 101-116 


1911 


BL00272 


Snake toxins proteins. 


BL00272C 8.27 1.000e-08 105-116 




PR00308


TYPE I ANTIFREEZE PROTEIN
SIGNATURE


PR00308A 5.90 6.795e-11 64-78
PR00308C 3.83 2.385e-10 67-76






RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 9.438e-10 57-71 


1925 


PR00833


POLLEN ALLERGEN POA PI
SIGNATURE


PR00833H 2.30 6.654e-09 59-73 


17JV 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.263e-10 107-116 


1935 


PF00075 


RNase H. 


PF00075J 15.78 2.309e-12 81-98 


1940 


PF00075 


RNase H. 


PF00075F 12.87 3.864e-09 74-84 


1 0^9 




LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 3.250e-10 184-197
PR00019A 11.19 5.667e-09 187-200


1954




Matrixins cysteine switch.


BL00546A 19.62 8.105e-30 77-106


1954 


BL00023 


Type II fibronectin collagen-binding 
domain proteins. 


BL00023 24.31 4.682e-35 340-376 
BL00023 24.31 2.969e-28 282-318 
BL00023 24.31 9.526e-24 224-260 


1954 


PR00138 


MATRIXIN SIGNATURE


PR00138B 15.82 5.500e-18 144-159 
PR00138A 15.14 8.773e-16 97-110 


1954 


BL00024 


Hemopexin domain proteins. 


BL00024B 21.53 9.591e-33 118-151 
BL00024A 11.49 2.800e-13 97-107 
BL00024C 22.98 7.796e-11 164-212


1954 


PR00013 


FIBRONECTIN TYPE II REPEAT
SIGNATURE 


PR00013C 12.29 1.000e-20 372-387
PR00013C 12.29 3.571e-15 314-329
PR00013C 12.29 7.800e-14 256-271
PR00013A 12.26 5.500e-13 344-353
PR00013B 14.75 1.237e-11 355-367
PR00013B 14.75 4.000e-09 297-309
PR00013A 12.26 5.333e-09 286-295
PR00013A 12.26 7.833e-09 228-237




BL01182


Glycosyl hydrolases family 35
proteins.


BL01182A 21.39 3.357e-34 77-119


1957 


PR00742 


GLYCOSYL HYDROLASE 
FAMILY 35 SIGNATURE 


PR00742B 15.52 2.653e-14 78-96 
PR00742A 13.75 6.914e-10 57-74 


1958 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 8.200e-15 214-235 


1964 


PR00727 


BACTERIAL LEADER
PEPTIDASE 1 (S26) FAMILY
SIGNATURE


PR00727A 12.93 7.000e-09 9-25






PF00075 


RNaseH 


PF00075D 10.71 7.188e-09 71-81 


1966 


PF00075 


RNaseH. 


PF00075C 11.58 9.786e-11 110-121
PF00075B 12.56 1.878e-10 78-88




DM00892


3 RETROVIRAL PROTEINASE


DM00892C 23.55 4.082e-11 314-347


1970 


PF00075 


RNaseH. 


PF00075J 15.78 8.571e-10 335-352 


1973 


PF00589 


Phage integrase family. 


PF00589B 16.17 1.450e-14 101-114 


1974 


BL00675 


Sigma-54 interaction domain 
proteins ATP-binding region A 
proteins.


dt nnfi7<5R 94 07 1 000e-24 118-172 
BL00675C 13.51 6.400e-24 183-210 
BL00675D 12.03 1.750e-09 245-254 


1987 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153B 11.57 1.500e-17 52-64
PR00153A 12.98 4.255e-10 23-38


1987 


BL00170


Cyclophilin-type peptidyl-prolyl
cis-trans isomerase signature.


BL00170B 20.97 6.250e-33 47-86
BL00170A 17.08 2.309e-09 17-43 


1998 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU


PD01066 19.43 7.750e-37 27-65
PD01066 19.43 8.863e-11 68-106


1999 


PF00992 


Troponin. 


PF00992A 16.67 3.487e-09 108-142 


1999


BL00224


Clathrin light chain proteins.


BL00224B 16.94 7.055e-09 96-148


1999 


BL00422 


Granins proteins. 


BL00422C 16.18 8.059e-09 117-144 


2001


BL00019


Actinin-type actin-binding domain
proteins.


BL00019B 13.34 7.158e-14 261-283


2001


DM01354


kw TRANSCRIPTASE REVERSE II
ORF2.


DM01354U 12.24 3.500e-13 345-364


ZUUo 


PD01719


PRECURSOR GLYCOPROTEIN
QjnxrAT PF 


PD01719A 12.89 3.483e-16 63-90


2011 


BL00282 


Kazal serine protease inhibitors 
family proteins.


BL00282 16.88 6.577e-10 127-149 


2011 


BL00222 


Insulin-like growth factor binding 
proteins.


BL00222B 11.09 6.940e-10 74-89


2011 


BL00621 


Tissue factor proteins. 


BL00621A 8.69 6.473e-09 5-22 


zUlz 




DTJnTTJTW WfiXFQTPTTPTTTP AT P 
VP1 R 


Pr>0?5MP 1^ 51 9 634e-10 74-128 

rX/V^JUJv> 1 J.J i ^ . U«/^t— X \J /"T-i.X.t* 


9fi1'* 
ZUiO 


PR00124


ATP SYNTHASE C SUBUNIT
SIGNATURE


PR00124A 8.81 5.655e-09 58-77


2013


PR00783


MAJOR INTRINSIC PROTEIN
FAMILY SIGNATURE


PR00783C 13.54 8.981e-09 48-67


2034 


PF00075 


RNase H. 


PF00075F 12.87 6.523e-09 183-193 


ZUJ / 


BL00326


Tropomyosins proteins.


BL00326D 8.76 9.327e-09 115-155


OH/1 0 


PR00671


INHIBIN BETA B CHAIN
SIGNATURE


PR00671B 4.29 8.767e-10 138-157


9fK9 
ZUjZ 


PD02455


ELEMENT TRANSPOSABLE
INSERTION PROTEIN
TRANSPOSITION DNA.


PD02455C 29.23 5.230e-09 225-276


ZU jo 


PF00075


RNase H.


PF00075J 15.78 9.000e-10 81-98


2074 


PD00066 


PROTEIN ZINC-FINGER METAL- 

aDaINaJI. 


PD00066 13.92 4.000e-13 62-74 


2074 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 4.462e-11 59-68
PR00048B 6.02 1.000e-10 89-98 
PR00048A 10.52 9.609e-10 101-114 


2074 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 9.100e-13 104-120 
BL00028 16.07 1.000e-08 46-62 


2076 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 1.900e-11 106-119
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* Results include in order: Accession No., subtype, e-value, and amino acid position of the signature in the 
Table 4



SEQ 
ID 

NO: 


Pfam Model 


Description 


E-value


Score


No: of 

Pfam 

Domains 


Position 
of the 
Domain 


1050 


FAA_hydrolase


Fumarylacetoacetate (FAA) hydrolase
fam


0.64


-89.1


1 


22-143 


1066 


rubredoxin


Rubredoxin 


7.2 


-11.1 




4-37 


1076 


ank 


Ankyrin repeat


0.01 


22.5 




25-57 


1076 


sodfe_C 


Iron/manganese superoxide dismutases, 
C-term 


3.9 


-67.9 


v 


38-124 


1076 


DUF232


Putative transcriptional regulator


O, 1 


-29.1 


T 

— 


134-254 


1099 


HMG_box


HMG (high mobility group) box


C 
o 


-22.4




17-61 


1109 


UPAR_LY6


u-PAR/Ly-6 domain




-6.2




34-112 


1110 


ldl_recept_a 


Low-density lipoprotein receptor
domain 




36.0


1 


196-240 


1110


CUB


CUB domain


0.38


-27.8




52-161


1118 


rvt 


Reverse transcriptase 


0.95 


-46.1 




38-207 


1125


adenylatekinase


Adenylate kinase


0.00037


-77.6




13-103 


1162 


KRAB


KRAB box


1.1e-23


92.1 




22-62 


1163 


connexin 


Connexin 


3.1e-23 


90.6 




1-130


1171 


KRAB 


KRAB box




86.2 




33-73 


1193


MHC_I


Class I Histocompatibility antigen,
domains 




1.1 




29-205 


1209 


DOMON 


DOMON domain 


1.9e-12 


54.8 




102-215 


1213 


IL8 


Small cytokines (intecrine/chemokine), 
inter


0.59 


-7.8 




18-55 


1218 


cys_rich_FGFR


Cysteine rich repeat 


A A 


-1 1 0 

1 1 .vs 




28-76 


1222 


Glyco_transf_10


Glycosyltransferase family 10 


£ £p 06 


"J*T. 1 


- 


1-322 


1240 


ig 


Immunoglobulin domain 


1.6e-06 


35.1 


2 


41-124:156-230


1258 


asp 


Eukaryotic aspartyl protease 


8e-06 


-110.8 


1 


19-241


1280 


DOMON


DOMON domain




"1U.U 


i 
i 


35-117 


1288 


PDZ 


PDZ domain (Also known as DHR or 
GLGF) 


1.1 


0.4 


1 


7-73 


1301 


Exonuclease 


Exonuclease 


7 /Id 




1 
i 


397-479 


1311 


Gemini_mov 


Geminivirus putative movement
protein 


J./ 


-AO ^ 


i 
i 




1341 


fn3


Fibronectin type III domain


6 6p-^6 


13? 7 




109- 

200:212- 
301 


1345 


Collagen 


Collagen triple helix repeat (20 copies) 


7.3 


-65.8 


1 


185-243 


1 1£K 

I JOj 


Amidase


Amidase


0.017 


-178.9 




68-276 




Galactosyl_T


Galactosyltransferase


7.1e-44 


159.2 




113-309 


1375 


Glyco_transf_25


Glycosyltransferase family 25 


3 


-77.1 


1 


146-293 


1381 


GRAM


GRAM domain


U.UC"1T 


59 6 




65-116 


1396 


Pep_M12B_propep


Reprolysin family propeptide


1.4e-27


105.1




75-191 


1396 


disintegrin 


Disintegrin


2.6e-10


47.7 




243-318 


1398 


SK_channel


Calcium-activated SK potassium
channel


1.8e-06


34.9 




1-57 


1413 




Immunoglobulin domain 




9 1 
7.i 




29-88 


1416 


dUTPase


dUTPase


0.00044


9.6 


j 


111-237 


1420 


Folate_rec


Folate receptor family


1.7


-111.2 




14-175 


1434 


lectin_c


Lectin C-type domain 


1.5e-05 


28.0 




233-319 


1440 


chromo 


'chromo' (CHRromatin Organization
Modifier)


4.6e-11


50.2 




92-133 


1449 


PMSR 


Peptide methionine sulfoxide reductase 


0.0089 


-65.8 




4-79 


1450 


SPRY 


SPRY domain 


9e-26 


99.0 




109-240 


1451 


^JinciC* HpTivdrata 

1 VI uU \l\slxj UX u Id 
g 


MaoC like domain


2.1e-15 


64.6 


1 


31-152 




NTP_transf_2


Nucleotidyltransferase domain 


2.6e-12 


54.3 


1 


121-234 


1467 


DAG_PE-bind


Phorbol esters/diacylglycerol binding
dom 


8.7e-05 


27.4 




130-180 


1467 


DC1 


DC1 domain 


0.66 


11.2 


1 


141-172 


1470




jmjC domain 


0.46 


-18.2 




166-262 


1474


pkinase


Protein kinase domain 


0.0019 


-85.7 


1 


2-187 


1475 


SSF 


Sodium:solute symporter family


0.13 


-177.1 


1 


1-311 


1478


dUTPase


dUTPase 


7.6 


-37.5 


1 


2-98 


1479


fn3


Fibronectin type III domain


1.1e-19


78.9 




14-100 


1485


rnaseH


RNaseH 


0.36 


-28.0 


1 


59-175 


1488 


NTR 


NTR/C345C module 


0.044 


-6.1 


; 


293-398 


1506


HSP70


Hsp70 protein


1.6e-13 


38.3 




61-424 


1^17 
I j 1 / 


UPAR_LY6


u-PAR/Ly-6 domain


0.33 


-8.2 


i 


44-106 


1 ^0 

lJjU 




RNaseH 


0.011 


-11.7 


i 


64-155 


1537 


p450 


Cytochrome P450 


2.1 


-176.6 


j 


31-316 


1 ^7 


DNA_ligase_OB


NAD-dependent DNA ligase OB-fold
domain 


9.2 


-42.9 




200-256 


1558 


KRAB 


KRAB box 


1.8e-18 


74.8 




68-108 


1564 


Phage_integrase


Phage integrase family 


1.2e-09 


45.5 




39-204 


1566 


MR_MLE 


Mandelate racemase / muconate 
lactonizing en 


0.00079 


-24.5 




153-352 


1 ^70 


HMA


Heavy-metal-associated domain


6.6e-13 


56.3 




71-131 


1580 




Immunoglobulin domain 


0.99 


15.2 




23-131 


1601 


WD40 


WD domain, G-beta repeat 


2e-08 


41.5 


3 


39-75:83-118:126-162


1606 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H type 


0.094 


19.3 


3 


105-129:141-173:183-209


1612 


zf-CCHC 


Zinc knuckle 


2.1e-05 


31.4 


2 


167-184:202-219


1618 


rnaseH 


RNaseH 


6.3e-14 


59.7 


1 


24-144 


1618 


Integrase_Zn


Integrase Zinc binding domain 


3.8e-07 


37.2 


1 


146-185 


1 £1 R 
xOlo 


DUF224


Domain of unknown function
(DUF224)


9.3 


-7.0 


1 


104-186 


1641 


adh_short


short chain dehydrogenase


4.6e-32 


119.9 


1 


42-309 


1667 


Xlink 


Extracellular link domain 


2.9e-83 


290.0 


2 


162-267:273-364


1667 




Immunoglobulin domain


0.0015 


25.2 


1 


61-145 


16R9 


rvt 


Reverse transcriptase


3.1e-31 


117.2 


1 


56-238 


1681


Gag_p30


Gag P30 core shell protein 


2.9e-33 


124.0 


1 


8-197 


1689 


KRAB 


KRAB box 


4.9e-22 


86.6 


1 


266-306 


1609 


ubiquitin


Ubiquitin family


0.00061 


26.5 


1 


17-91 


1709 


fibrinogen_C 


Fibrinogen beta and gamma chains, C-
term


7.9e-85 


295.2 


1 


37-255 


1713 


HOK_GEF


Hok/gef family


2.4 


-7.8 


1 


7-54 


1716 


Gag_p30 


Gag P30 core shell protein 


0.0036 


-49.7 


1 


64-229 


1721 


rnaseH 


RNaseH 


0.011 


-11.7 


1 


207-350 


1722 


dUTPase 


dUTPase


0.37 


-22.9 


1 


93-217 


179S 




Immunoglobulin domain 


4.2e-13 


57.0 


2 


80-141:259-320


1725


IQ


IQ calmodulin-binding motif


4.3e-05 


30.4 


1 


49-69 


1727 


pkinase 


Protein kinase domain 


3e-21 


84.0 


1 


71-267 


1728 


Fringe 


Fringe-like 


5.9 


-112.6 


; 


165-370 


1734 




Immunoglobulin domain 


0.014 


22.0 




117-170 


1737


PP2C


Protein phosphatase 2C


0.0067 


-50.5 




37-273 


1738 


SH3 


SH3 domain 


1.7e-05


31.7 


-j — 


102-159 


174ft 




RNase H 


0.0042 


-7.3 




126-270 


1744 


DAG_PE-bind 


Phorbol esters/diacylglycerol binding 
dom 


2.9 


-11.1 


1 


26-55 




PHD


PHD-finger


3.3 


-14.7 




9-61 


1760 


GARS_N


Phosphoribosylglycinamide synthetase,
N


8.2 


-62.0 


1 


35-95 


l t\j\J 


Armadillo_seg


Armadillo/beta-catenin-like repeat


9.1 


8.7 


2 


44-84:131-171


1778 


7tm_1


7 transmembrane receptor (rhodopsin
family)


1e-12


55.7 


1 


41-276 


1778 


YCF9 


YCF9 


3.1 


-18.5 


1 


203-258 


1787 


C1q


C1q domain


1e-05


13.2 


1 


111-230 


1787 


Collagen 


Collagen triple helix repeat (20 copies) 


0.0043 


-3.0 


1 


50-107 


17R9 


jmjC


jmjC domain


0.00078 


12.0 


1 


52-241 






Immunoglobulin domain


0.0037 


23.9 


1 


64-141 


1796 


rve 


Integrase core domain 


2.6e-28 


107.5 


1 


20-174 


1802 


zf-C2H2 


Zinc finger, C2H2 type 


6e-15 


63.1 


2 


68-90:108-130




f Hall II II 




0.00054


18.6 


1 


26-131 


1812


ank




3.6e-23 


90.4 


3 


159-191:205-237:244-276


1824 


PHD 


PHD-finger 


1.1e-12


55.6 


1 


62-110 


1826 


PAP_assoc


PAP/25A associated domain 


1.5e-06 


35.2 


1 


101-155 


1827 


ig 


Immunoglobulin domain 


1.6 


13.4 


1 


29-102 


1830 


RhoGEF 


RhoGEF domain 


3.3e-06 


24.0 


1 


110-280 


1830 


PH 


PH domain 


2.8 


6.7 


1 


356-451 


1833 


zf-CCHC 


Zinc knuckle 


2.1e-06 


34.7 


1 


137-154 


urn 

X OJJ 


rvt


Reverse transcriptase


7.7e-06 


25.9 


1 


84-277 


1 OAA 
loir 




Ubiquitin carboxyl-terminal hydrolase
family


0.15 


-8.5 


1 


165-238 


1R4fi 




Armadillo/beta-catenin-like repeat


0.28 


17.7 


2 


50-91:92-132


1860 


zf-CCHC 


Zinc knuckle 


3.2e-05 


30.8 


1 


179-196 


1864 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


0.0022 


23.3 


1 


218-256 


1887 


ig 


Immunoglobulin domain 


4e-08 


40.4 


1 


35-112 


1889 


LRR 


Leucine Rich Repeat 


0.051 


20.1 


1 


62-85 


1895 


rnaseH


RNaseH 


3.4e-06 


25.8 


1 


47-177 


1899 


Brevenin 


Brevenin/esculentin/gaegurin/rugosin
family 


7.5 


-2.9 


1 


1-51 


1911 


UPAR_LY6


u-PAR/Ly-6 domain 


1.3e-06 


35.4 


1 


44-117 


101 1 


toxin


Snake toxin


3 


-19.5 


1 


66-117 


1011 


Activin_recp


Activin types I and II receptor domain


9.5 


-14.0 


1 


30-118 


1 Ol 9 


rvp


Retroviral aspartyl protease


7 


-26.3 


1 


42-142 


1 01 1 


SAM


SAM domain (Sterile alpha motif)


3.9e-13 


57.1 


2 


105-170:183-247


1Q1 f\ 

I7IO 


Sema


Sema domain 


1.4e-14 


54.6 


I 


51-434 


1Q96 


PAP2


PAP2 superfamily


2.9e-07 


37.6 


I 


48-142 


1010 

17JV 




Immunoglobulin domain 


2.7e-07 


37.6 


I 


41-116 


1Q^S 


rve


Integrase core domain 


2.5e-13 


57.7 


I 


1-138 


1940 


rnaseH 


RNase H 


1.1e-26


102.0 




24-153 


1940 


Integrase Zn 


Integrase Zinc binding domain 


4.7e-12 


53.5 


1


155-194 


1952 


LRRNT


Leucine rich repeat N-terminal domain


0.0027 


24.4 




67-95 




UQ_con


Ubiquitin-conjugating enzyme


2 Re-08 


40.9 




78-219 




Peptidase_M10


Matrixin


6.7e-86


298.8 




53-212 




fn2


Fibronectin type II domain


1e-79


278.2 


3 


231-272:289-330:347-388


1958 


ras 


Ras family


1.9 


-132.0 


1 


215-284 


1963 


tsp_1


Thrombospondin type 1 domain


0.083 


8.0 




20-63 


1966 

X Jr \J\J 


rvt 


Reverse transcriptase 


1.5e-05 


21.9 




2-196 


1968 


G-patch


G-patch domain


0.3 


6.0 


I 


307-352 


1968 


rvp 


Retroviral aspartyl protease 


1.4 


-19.9 


j 


274-385 


xy i\j 


rve


Integrase core domain


0.78 


-16.8 




265-395 




Phage_integrase


Phage integrase family


5.7e-08 


39.9 




1-153 


1974 


Sigma54_activat


Sigma-54 interaction domain 


3.1e-37 


137.2 


1 


63-253 


107^ 
iy 1 3 


Na_Pi_cotrans


Na+/Pi-cotransporter


0.0085


-99.2 




1-146 


\y fj 


signal 


His Kinase A (phosphoacceptor)
domain


7


-7.7


1 


85-147 


1978 


UPAR_LY6


u-PAR/Ly-6 domain 


1.8 


-16.0 




21-96 


iy /o 


Zn_clus


Fungal Zn(2)-Cys(6) binuclear cluster
domain


5.1


-5.7 




21-60 


1987


pro_isomerase


Cyclophilin type peptidyl-prolyl cis-t


1.2e-18 


75.4 


1 


4-171 


1997


zf-CCHC


Zinc knuckle


1.9e-05 


31.5 


2 


181-198:204-220


1997


TFIID-31


Transcription initiation factor IID,
31kD su


7.9 


-63.3 


1 


75-187 


1997


Gag_p12


Gag polyprotein, inner coat protein p12


8.9 


-9.5 


I 


155-229 


1998 


KRAB 


KRAB box 


2e-23 


91.2 


\ 


27-65


2001


CH


Calponin homology (CH) domain


0.019 


10.8 




230-330 


2001 


SAM 


SAM domain (Sterile alpha motif) 


0.9 


6.5 


j 


248-311 


2008


tsp_1


Thrombospondin type 1 domain


0.013 


15.1 




64-98 


2011




Immunoglobulin domain


1.7e-05 


31.7 




186-255 


2011


kazal


Kazal-type serine protease inhibitor
domain


0.00028


27.6 




121-168 


2011


IGFBP


Insulin-like growth factor binding
protein


0.17 


2.5 




53-113 


2011 


zf-UBR1


Putative zinc finger in N-recognin 


8.3 


-24.0 




54-112 


2015 


PH 
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MINIMIZED AVERAGE 
STRUCTURE) 1HSM 4


DNA-BINDING HIGH 
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PMF 
score 




79.57 




69.17 




SEQFOLD 
score 


CARBONYL REDUCTASE; 
CHAIN: A, B, C,D; 

GDP-MANNOSE 4,6- 


CARBONYL REDUCTASE; 
CHAIN: A, B, C,D; 




CIS-BIPHENYL-2,3- 
DIHYDRODIOL-2,3- 
DEHYDROGENASE; CHAIN: 
NULL; 


CIS-BIPHENYL-2,3- 
DIHYDRODIOL-2,3- 
DEHYDROGENASE; CHAIN: 
NULL; 


CHAIN: A, B; 


Compound 


OXIDOREDUCTASE SHORT- 
CHAIN DEHYDROGENASE, 
OXIDOREDUCTASE 
LYASE DEHYDRATASE. NADP. 


OXIDOREDUCTASE SHORT- 
CHAIN DEHYDROGENASE, 
OXIDOREDUCTASE


DEHYDROGENASE, PCB 
DEGRADATION 


OXIDOREDUCTASE NAD- 
DEPENDENT 

OXIDOREDUCTASE, SHORT- 
CHAIN ALCOHOL 2 


OXIDOREDUCTASE NAD- 
DEPENDENT 

OXIDOREDUCTASE, SHORT- 
CHAIN ALCOHOL 2 
DEHYDROGENASE, PCB 
DEGRADATION 


DEHYDROGENASES/REDUCTAS 
ES, TERNARY COMPLEX, NAD- 
3-PENTANONE 4 ADDT IPT 


OXIDOREDUCTASE, 
DETOXIFICATION, 
METABOLISM, ALCOHOL 2 
DEHYDROGENASE, 
DROSOPHILA LEBANONENSIS, 
SHORT-CHAIN 3 
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SINGLE-CHAIN ANTIBODY 
FRAGMENT; CHAIN: A, C; 


HYDROLASE(O-GLYCOSYL) 
N9 NEURAMINIDASE-NC41
(E.C.3.2.1.18) COMPLEX WITH
FAB 1NCA 3


IMMUNOGLOBULIN 
IMMUNOGLOBULIN FAB 
FRAGMENT (MC/PCS603) 
1MCP 4


IMMUNOGLOBULIN 
IMMUNOGLOBULIN FAB 
FRAGMENT (MC/PCS603) 
1MCP 4


SYNONYMS: L5MK16 
DIABODY, SINGLE-CHAIN FV 
DIMER 1LMK 4 


IMMUNOGLOBULIN ANTI- 
PHOSPHATIDYLINOSITOL
SPECIFIC PHOSPHOLIPASE C 
DIABODY UMK 1 




IGG2A INTACT ANTIBODY - 
MAB231: CHAIN: A.B.C.D 


m) IIGC 5 PROTEIN G, 
STREPTOCOCCUS IIGC 15 


Compound 


IMMUNOGLOBULIN VARIABLE 
HEAVY (VH) DOMAIN, 
VARIABLE LIGHT (VL)
ANTIBODY FRAGMENT, 










IMMUNOGLOBULIN INTACT 
IMMUNOGLOBULIN V REGION 
C REGION, IMMUNOGLOBULIN 
HIV-1 REVERSE
TRANSCRIPTASE (CHAIN A); 
CHAIN: A; HIV-1 REVERSE
TRANSCRIPTASE (CHAIN B);


HIV-1 REVERSE
TRANSCRIPTASE (CHAIN A);
CHAIN: A; HIV-1 REVERSE
TRANSCRIPTASE (CHAIN B);
CHAIN: B; ANTIBODY (LIGHT
CHAIN); CHAIN: L; ANTIBODY
(HEAVY CHAIN); CHAIN: H;
DNA (5'- CHAIN: T; DNA (5'-
CHAIN: P;


HIV-1 REVERSE
TRANSCRIPTASE (A-CHAIN);
CHAIN: A; HIV-1 REVERSE
TRANSCRIPTASE (B-CHAIN);
CHAIN: B;


HIV-1 REVERSE
TRANSCRIPTASE (A-CHAIN);
CHAIN: A; HIV-1 REVERSE
TRANSCRIPTASE (B-CHAIN);
CHAIN: B;




IGG1 ANTIBODY 32C2; CHAIN:
A; IGG1 ANTIBODY 32C2;
CHAIN: B; 


FAB) 2FGW 4 


Compound 


TRANSFERASE/IMMUNE 
SYSTEM/DNA HIV-1 RT; HIV-1
RT; HIV, REVERSE
TRANSCRIPTASE, MET184ILE,


TRANSFERASE/IMMUNE
SYSTEM/DNA HIV-1 RT; HIV-1
RT; HIV, REVERSE
TRANSCRIPTASE, MET184ILE,
3TC, PROTEIN-DNA 2 COMPLEX,
DRUG RESISTANCE, M184I,
TRANSFERASE/IMMUNE 3
SYSTEM/DNA


TRANSFERASE HIV-1 REVERSE
TRANSCRIPTASE, AIDS, NON-
NUCLEOSIDE INHIBITOR, 2
DRUG DESIGN


TRANSFERASE HIV-1 REVERSE
TRANSCRIPTASE, AIDS, NON-
NUCLEOSIDE INHIBITOR, 2
DRUG DESIGN 




IMMUNE SYSTEM FAB, 
ANTIBODY, AROMATASE, P450 
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SEQFOLD 
score 


FIBRINOGEN (ALPHA CHAIN); 


FIBRINOGEN (ALPHA CHAIN); 
CHAIN: A, D, N, Q; 
FIBRINOGEN (BETA CHAIN); 
CHAIN: B, E, O, R;
FIBRINOGEN (GAMMA 
CHAIN); CHAIN: C, F, P, S; 
FIBRINOGEN; CHAIN: M, Z; 




UBIQUITIN CORE MUTANT 
1D7; CHAIN: A; 


CHROMOSOMAL PROTEIN 
UBIQUITIN 1UBI 3 


UBIQUITIN TETRAUBIQUITIN 
1TBE 3


1D8 UBIQUITIN; CHAIN: A; 


UBIQUITIN-LIKE PROTEIN 7, 
RUB1; CHAIN: A;






Compound 


BLOOD CLOTTING COILED- 


BLOOD CLOTTING COILED- 
COIL




UBIQUITIN UBIQUITIN, 
DESIGNED CORE MUTANT 






DE NOVO PROTEIN PROTEIN 
DESIGN, HYDROPHOBIC CORE, 
PACKING, ROTAMERS, ROC, 2 
UBIQUITIN, DE NOVO PROTEIN, 
UBIQUITIN 


SIGNALING PROTEIN RUB1,
UBIQUITIN-LIKE PROTEIN,
ARABIDOPSIS, SIGNALING
PROTEIN




HETERONUCLEAR NMR 
SPECTROSCOPY, 3 
VIRUS/VIRAL PROTEIN 
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SEQFOLD 
score 


FIBRINOGEN; CHAIN: A, B, C, 
D,E,F,S,T,M,N; 


FIBRINOGEN-420; CHAIN: A, 
B,C,D,E,F,G,H; 


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FIBRIN; CHAIN: A, B, C, D, E, F, 
G,H,I,J; 


FIBRIN; CHAIN: A, B, C, D, E, F, 
G,H,U; 


FIBRIN; CHAIN: A, B, C, D, E, F, 
G,H,U; 


CARBOXYL TERMINAL 
FRAGMENT; CHAIN: NULL; 


Compound 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA, 
PLATELET, FIBRINOGEN, 


BLOOD COAGULATION BLOOD 
COAGULATION, FIBRINOGEN- 
420, ALPHAEC DOMAIN, 2 
FIBRINOGEN RELATED 
DOMAIN, GLYCOSYLATED 
PROTEIN 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


BLOOD COAGULATION BLOOD 
COAGULATION, PLASMA 
PROTEIN, CROSSLINKING 


FACTOR BLOOD 
COAGULATION, 
GLYCOPROTEIN, CALCIUM,
PLATELET, PLASMA, 2 
ALTERNATIVE SPLICING, 
SIGNAL, DISEASE MUTATION, 3 
POLYMORPHISM 


PDB annotation 



WO 03/080795 



319 



PCT/US02/25485




CHAIN 
ID 


>— » 


*> 

s> 


-J 


VO 




u 


o 


»— • 

UJ 

i— • 




START 
AA 


o 

Ol 


VO 
VO 


00 


VO 




to 


to 

£ 


lo 

£ 






9.6e-17 


3.2e-11


0.004 


00 
CP 




4.5e-17 


9e-12 


1.3e-16 




Psi 
Blast 


-0.33 


0.05 


-0.72 


-0.20 




0.59 


0.03 


0.37 




Verify 
score 


0.30 


0.12 


0.07 


0.28 




0.82 


0.13 


0.75 




PMF 
score 




















SEQFOLD 
score 


RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN: B: 


CHAIN: A; 


PHOSPHATIDYLINOSITOL-3- 
PHOSPHATE BINDING FYVE 


PROTEIN KINASE C DELTA 
TYPE; 1PTQ 4


HEPATOCYTE GROWTH 
FACTOR-REGULATED 
TYROSINE CHAIN: A; 




HYDROLASE(ENDORIBONUC
LEASE) RIBONUCLEASE H
(E.C.3.1.26.4) 1RIL 3


HYDROLASE(ENDORIBONUC
LEASE) RIBONUCLEASE H
DOMAIN OF /HIV-1$ REVERSE
TRANSCRIPTASE 1HRH 3


RIBONUCLEASE HI; CHAIN: A; 




Compound 


COMPLEX (GTP- 
BINDING/EFFECTOR) RAS- 


TRANSPORT PROTEIN FYVE 
DOMAIN, ENDOSOME 
MATURATION, 
INTRACELLULAR 
TRAFFICKING, 2 TRANSPORT 
PROTEIN 


PHOSPHOTRANSFERASE 


TRANSFERASE HRS; HRS, VHS, 
FYVE, ZINC FINGER, 
SUPERHELIX








HYDROLASE RNASE H, 
NUCLEASE, RNASE H\ 
RIBONUCLEASE H, METAL-
BINDING 2 PROTEIN, PROTEIN
FOLDING 


BINDING 2 PROTEIN, PROTEIN 
FOLDING 


PDB annotation 






1812 


1812 


1812 


1812 


1812 


1812 




NO: 


m 52 


likn 


lihb 


lihb 


lihb 


lihb 


O 




B§ 


O 


> 


> 


> 


> 


> 




CHAIN 
ID 


~j 

ON 






to 
vo 


s 


U> 




START 
AA 


VO 

o 


K> 
VO 


u> 

t — I 

*. 


u> 
o 


to 

OS 

oo 


to 
C\ 






1.6e-27 


1.1e-25


*. 
bo 
? 

K> 
<Ji 


1.6e-31 


6.4e-27 


6.8e-22 




Psi 
Blast 


0.07 


-0.11 


0.16 


0.40 


0.34 


0.25 




Verify 
score 


-0.18 


0.63 


0.15 


0.95 


1.00 


0.69 




PMF 
score 
















SEQFOLD 
score 


NF-KAPPA-B P65 SUBUNIT;
CHAIN: A; NF-KAPPA-B P50D 


CYCLIN-DEPENDENT KINASE 
6 INHIBITOR; CHAIN: A, B; 


CYCLIN-DEPENDENT KINASE 
6 INHIBITOR; CHAIN: A, B; 


CYCLIN-DEPENDENT KINASE 
6 INHIBITOR; CHAIN: A, B; 


CYCLIN-DEPENDENT KINASE 
6 INHIBITOR; CHAIN: A, B; 


PYK2-ASSOCIATED PROTEIN 
BETA; CHAIN: A; 



Compound 
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ALPHA, BETA T-CELL 
RECEPTOR CHAIN: A, B; 


RECEPTOR BETA CHAIN;
CHAIN: E; 


Compound 


SIGNAL TRANSDUCTION SON 


SIGNALING PROTEIN ARF1
GUANINE NUCLEOTIDE
EXCHANGE FACTOR AND PH
DOMAIN


SIGNALING PROTEIN 11 ALPHA-
HELICES


GENE REGULATION SON OF 
SEVENLESS PROTEIN; GUANINE 
NUCLEOTIDE EXCHANGE 
FACTOR, GENE REGULATION 


GENE REGULATION SON OF 
SEVENLESS PROTEIN; GUANINE 
NUCLEOTIDE EXCHANGE 
FACTOR, GENE REGULATION 


TRANSPORT PROTEIN RHO-
GTPASE EXCHANGE FACTOR,
TRANSPORT PROTEIN 


SIGNAL TRANSDUCTION 
SIGNAL TRANSDUCTION, SOS, 
DOMAIN




RECEPTOR TCR; T-CELL, 
RECEPTOR, TRANSMEMBRANE, 
GLYCOPROTEIN, SIGNAL 
PEPTIDE RECOGNITION
PROTEIN LOCALIZATION
NMR, RECEPTOR
OLIGOMERIZATION, EPH 
RECEPTORS, TYROSINE 2 
PHOSPHORYLATION, SIGNAL 
TRANSDUCTION, TYROSINE- 
PROTEIN 3 KINASE 


TYROSINE-PROTEIN KINASE 
NMR, RECEPTOR 
OLIGOMERIZATION, EPH 
RECEPTORS, TYROSINE 2 
PHOSPHORYLATION, SIGNAL 
TRANSDUCTION, TYROSINE- 
PROTEIN 3 KINASE 
DOMAIN, EPH RECEPTOR,
SIGNAL TRANSDUCTION,
OLIGOMER
SIGNAL TRANSDUCTION,
OLIGOMER
Blast


0.23 
Verify
score 


-0.02 


-0.06 


0.09 






PMF 
score 








53.60 






SEQFOLD 
score 


NERVE GROWTH FACTOR; 
CHAIN: V, W; TRKA 




TWITCHIN 18TH IGSF 
MODULE; CHAIN: NULL;


MUSCLE PROTEIN TITIN
MODULE M5 (CONNECTIN)
1TNM 3 (NMR, MINIMIZED
AVERAGE STRUCTURE) 1TNM
4 1TNM 58




THROMBIN; CHAIN: L, H, J, K; 
RHODNIIN; CHAIN: R, S;


TITIN; CHAIN: NULL; 


IMMUNOGLOBULIN 
HETEROLOGOUS LIGHT 
CHAIN DIMER 1MCW 3
(/MCG$-/WEIR$ HYBRID)
1MCW 4


Compound 


NERVE GROWTH 
FACTOR/TRKA COMPLEX 


MUSCLE PROTEIN 
IMMUNOGLOBULIN 
SUPERFAMILY, I SET, MUSCLE 




COMPLEX (SERINE 
PROTEASE/INHIBITOR) 
COMPLEX (SERINE 
PROTEASE/INHIBITOR), KAZAL- 

TVPR TMTTTOTTrkD t Tiro numm 


MUSCLE PROTEIN CONNECTIN 
NEXTM5; CELL ADHESION, 
GLYCOPROTEIN, 
TRANSMEMBRANE, REPEAT, 
BRAIN, 2 IMMUNOGLOBULIN 
FOLD, ALTERNATIVE SPLICING, 
STOW A T 7 mt Terr r? dd /"vt^tjtkt 




PMF 
score 








63.96 










SEQFOLD 
score 




CALMODULIN; CHAIN: A;
RS20; CHAIN: B; 


CALCIUM-BINDING PROTEIN 
RAT ONCOMODULIN 1RR0 1 




PHOSPHOLIPASE C DELTA-1;
CHAIN: NULL;




PHOSPHOLIPASE C DELTA-1;
CHAIN: NULL;


CALCIUM-BINDING PROTEIN 
NCS-1; CHAIN: A; 


CALMODULIN; CHAIN: A; 




Compound 


CALMODULIN, CALCIUM 
BINDING, HELIX-LOOP-HELIX 
SIGNALLING, 2 

COMPLEX(CALCIUM-RTr>mT\rn 




SIGNAL TRANSDUCTION 

PROTEIN PLECKSTRIN, 

PHOSPHOLIPASE, INOSITOL 

TRISPHOSPHATE, 2 SIGNAL 

TRANSDUCTION PROTEIN, 
irvTYDrkT a ct? 


METAL BINDING PROTEIN 
YEAST FREQUENIN EF-HAND, 
CALCIUM 

SIGNAL TRANSDUCTION 
PROTEIN PLECKSTRIN, 
PHOSPHOLIPASE, INOSITOL 
TRISPHOSPHATE, 2 SIGNAL 
TRANSDUCTION PROTEIN, 

TTVTYDrYT A or? 


TRANSPORT PROTEIN 
CALCIUM BINDING, EF HAND, 

FHTTP-TTRT TV DTTXTTvr tj 


HAND CALCIUM-BINDING 
PROTEIN, PROTEIN- 2 
COELENTERAZINE PEROXIDE 
9.6e-24


1.6e-23 


1.7e-16 


6.8e-22 


6.8e-10 


9.6e-08 


1.7e-22 




Psi 
Blast 


0.41 


0.57 


0.43 


0.17 


0.18 


0.11 


0.46 




Verify 
score 


0.92 


0.66 


0.16 


0.98 


0.63 


0.87 


1.00 




PMF 
score 


















SEQFOLD 
score 


INTERNALIN B; CHAIN: A; 


INTERNALIN B; CHAIN: A; 


U2 RNA HAIRPIN IV; CHAIN: 
Q, R; U2 A'; CHAIN: A, C; U2 B"; 
CHAIN: B, D; 


U2 RNA HAIRPIN IV; CHAIN: 
Q, R; U2 A'; CHAIN: A, C; U2 B"; 
CHAIN: B, D; 


U2 RNA HAIRPIN IV; CHAIN: 
Q, R; U2 A'; CHAIN: A, C; U2 B"; 
CHAIN: B,D; 


U2 RNA HAIRPIN IV; CHAIN:
Q, R; U2 A'; CHAIN: A, C; U2 B";
CHAIN: B, D; 


U2 RNA HAIRPIN IV; CHAIN: 
Q, R; U2 A'; CHAIN: A, C; U2 B"; 
CHAIN: B,D; 


Q, R; U2 A'; CHAIN: A, C; U2 B"; 
CHAIN: B, D; 


Compound 


CELL ADHESION LEUCINE RICH 
REPEAT, CALCIUM BINDING, 


CELL ADHESION LEUCINE RICH 
REPEAT, CALCIUM BINDING, 
CELL ADHESION 


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP,RIBONUCLEOPROTEIN 


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR 
PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP, RIBONUCLEOPROTEIN


PROTEIN/RNA) COMPLEX 
(NUCLEAR PROTEIN/RNA), RNA, 
SNRNP, RIBONUCLEOPROTEIN
Table 6



SEQ ID NO: 


Position of Signal 
Peptide 


Maximum score 


Average score 


1042 


28 


0.969 


0.829 


1043 


19 


0.891 


0.574 


1044 


26 


0.953 


0.774 


1045 


13 


0.891 


0.675 


1046 


19 


0.987 


0.941 


1047 


24 


0.969 


0.817 


1048 


11 


0.953 


0.814 


1049 


17 


0.923 


0.602 


1050 


26 


0.977 


0.685 


1051 


39 


0.978 


0.765 


1052 


22 


0.982 


0.918 


1053 


15 


0.989 


0.965 


1054 


24 


0.912 


0.655 


1055 


31 


0.885 


0.603


1056 


27 


0.924 


0.593 


1057 


14 


0.907 


0.696 


1058 


22 


0.945 


0.759 


1059 


29 


0.917 


0.690 


1060 


21 


0.973 


0.669 


1061 


19 


0.891 


0.574 


1062 


16 


0.924 


0.790 


1063 


16 


0.951 


0.883 


1064 


23 


0.913 


0.702 


1065 


27 


0.948 


0.670 


1066 


17 


0.903 


0.714 


1067 


20 


0.923 


0.683 


1068 


18 


0.987 


0.939 


1069


16 


0.969 


0.904 


1070 


19 


0.991 


0.955 


1071 


31 


0.969 


0.810 


1072 


17 


0.926 


0.683 


1073 


22 


0.956 


0.916 


1074 


20 


0.989 


0.903 


1075 


15 


0.899 


0.790 


1076 


15 


0.990 


0.963 


1077 


25 


0.901 


0.586 


1078 


13 


0.908 


0.661 


1079 


20 


0.901 


0.669 


1080 


17 


0.963 


0.692 


1081 


13 


0.891 


0.675 


1082 


20 


0.944 


0.831 


1083 


17 


0.961 


0.880 


1084 


34 


0.888 


0.611 


1085 


26 


0.920 


0.700 


1086 


21 


0.948 


0.853 


1087 


28 


0.963 


0.728 


1088 


22 


0.987 


0.828


1089 


22 


0.979 


0.946 


1090 


26 


0.908 


0.557 


1091 


27 


0.978 


0.831 


1092 


13 


0.971 


0.905 


1093 


19 


0.939 


0.711 


1094 


35 


0.938 


0.657 


1095 


16 


0.909 


0.828 


1096 


18 


0.937 


0.773 


1097 


21 


0.994 


0.969 


1098 


15 


0.949 


0.849 


1099 


27 


0.903 


0.644 


1100 


21 


0.987 


0.895 


1101 


31 


0.923 


0.626 


1102 


25 


0.986 


0.932 


1103 


33 


0.998 


0.887 


1104 


23 


0.990 


0.932 


1105 


19 


0.936 


0.685 


1106 


27 


0.910 


0.566 


1107


24 


0.915 


0.567


1108 


15 


0.937 


0.732


1109 


21 


0.950 


0.801 


1110 


25 


0.965 


0.890 


1111 


11 


0.953 


0.814 


1112 


33 


0.963 


0.577 


1113 


20 


0.935 


0.834 


1114 


14 


0.938 


0.795 


1115 


32 


0.942 


0.655 


1116 


23 


0.957 


0.596 


1117 


19 


0.886 


0.594 


1118 


23 


0.994 


0.966 


1119 


26 


0.939 


0.810 


1120 


18 


0.930 


0.656 


1121 


22 


0.967 


0.697 


1122 


18 


0.983 


0.961 


1123 


18 


0.896 


0.737 


1124 


31 


0.932 


0.598 


1125 


23 


0.989 


0.959


1126 


18 


0.960 


0.753 


1127 


23 


0.965 


0.785 


1128 


33 


0.969 


0.791 


1129 


48 


0.987 


0.614 


1130 


15 


0.975 


0.934 


1131 


20 


0.986 


0.933 


1132 


22 


0.981 


0.883 


1133 


24 


0.941 


0.732


1134 


18 


0.916 


0.728


1135 


18 


0.926 


0.701 


1136 


31 


0.971 


0.816 


1137 


33 


0.937 


0.599 


1138 


27 


0.922 


0.559


1139 


17 


0.948 


0.609 


1140 


24 


0.985 


0.945 


1141


19 


0.881 


0.618 


1142


27 


0.932 


0.726 


1143 


24 


0.977 


0.812


1144 


25 


0.948 


0.848 


1145 


19 


0.973 


0.819 


1146 


20 


0.955 


0.612 


1147 


28 


0.974 


0.846 


1148 


14 


0.944 


0.864 


1149


40 


0.993 


0.932 


1150 


16 


0.969 


0.912 


1151 


25 


0.927 


0.727 


1152 


22 


0.939 


0.684 


1153 


32 


0.925 


0.578 


1154 


21 


0.962 


0.823 


1155 


19 


0.944 


0.719 


1156 


14 


0.897 


0.638 


1159 


31 


0.982 


0.594 


1160 


29 


0.880 


0.645 


1161 


19


0.970 


0.823 


1162 


23 


0.886 


0.627 


1163 


22 


0.983 


0.953 


1164 


18 


0.975 


0.858 


1166 


29 


0.924 


0.661 


1167 


31 


0.953 


0.687 


1168 


23


0.967 


0.832 


1169 


18 


0.928 


0.698 


1170 


18 


0.968 


0.806 


1171 


21 


0.932 


0.654 


1172 


20 


0.932 


0.660 


1173


18 


0.952 


0.791 


1174


16 


0.900 


0.629 


1175 


21 


0.892 


0.786 


1176


27 


0.979 


0.837 


1177 


23 


0.961 


0.663 


1178 


23 


0.974 


0.782 


1179 


40 


0.921 


0.764 


1180 


25 


0.966 


0.910


1181 


30 


0.927 


0.676 


1183 


22 


0.942 


0.807 


1184 


22 


0.971 


0.887 


1185


33 


0.963 


0.851 


1187 


16 


0.993 


0.954 


1188 


17 


0.940 


0.789 


1189 


18 


0.925 


0.784 


1190 


18 


0.965 


0.733 


1191 


23 


0.956 


0.636 


1192 


31 


0.992 


0.803 


1193 


25 


0.991 


0.948 


1194 


20 


0.927 


0.617 


1195 


26 


0.986 


0.895 


1196 


30 


0.889 


0.618 


1197 


23 


0.983 


0.873 


1198 


30 


0.993 


0.815 


1199 


18 


0.985 


0.956 


1201 


6 


0.885 


0.564 


1202 


28 


0.959 


0.730 


1203 


29 


0.916 


0.707 


1204 


22 


0.940 


0.800 


1205 


16 


0.888 


0.646 


1206 


21 


0.908 


0.558 


1207 


27 


0.953 


0.564 


1208 


43 


0.969 


0.757 


1209 


27 


0.965 


0.891 


1212 


19 


0.976 


0.809 


1213 


20 


0.988 


0.872


1214 


31 


0.987 


0.871 


1215 


18 


0.989 


0.880 


1216 


34 


0.920 


0.550 


1218 


20 


0.957 


0.870 


1219 


25 


0.928 


0.615 


1220 


18 


0.989 


0.955 


1221 


14 


0.892 


0.686 


1222 


21 


0.979 


0.940 


1223 


24 


0.979 


0.930


1224 


42 


0.983 


0.771 


1225 


22 


0.982 


0.811 


1226 


21 


0.945 


0.794 


1227 


15 


0.969 


0.910 


1229 


16 


0.916 


0.622 


1230 


29 


0.972 


0.769 


1232 


14 


0.945


0.836 


1233 


30 


0.963 


0.669 


1234 


29 


0.989 


0.867 


1235 


34 


0.977 


0.891 


1236 


36 


0.934 


0.673 


1237 


32 


0.922 


0.720 


1238 


22 


0.950 


0.828 


1239 


22 


0.956 


0.763 


1240 


24 


0.981 


0.938 


1241 


19 


0.891 


0.574 


1242 


32 


0.974 


0.869 


1243 


33 


0.890 


0.675 


1244 


25 


0.934 


0.593 


1245 


22 


0.944 


0.709 


1246 


39 


0.940 


0.714 


1247 


29 


0.889 


0.658 


1248 


19 


0.883 


0.749 


1249 


24 


0.892 


0.577 


1250 


21 


0.916 


0.662 


1251 


29 


0.921 


0.601 


1252 


17 


0.954 


0.741 


1253 


27 


0.888 


0.738 


1254 


28 


0.983 


0.920 


1256 


26 


0.975 


0.705 


1257 


19 


0.914 


0.698 


1258 


18 


0.961 


0.869 


1259 


41 


0.962 


0.600 


1260 


18 


0.947 


0.664 


1261 


18 


0.946 


0.739 


1262 


20 


0.889 


0.561 


1263 


31 


0.973 


0.865 


1264 


18 


0.956 


0.850 


1265 


14 


0.952 


0.875 


1266 


29 


0.902 


0.563 


1267 


20 


0.966 


0.739 


1268 


23 


0.953 


0.688 


1269 


38 


0.919 


0.676 


1270 


27 


0.955 


0.826 


1271 


23 


0.913 


0.702 


1273 


21 


0.972 


0.915 


1274 


23 


0.950 


0.578 


1275 


20 


0.996 


0.965 


1276


20 


0.976 


0.937 


1278 


26 


0.962 


0.752 


1279 


38 


0.962 


0.756 


1280 


19 


0.991 


0.929 


1281 


27 


0.948 


0.670 


1282 


22 


0.932 


0.790 


1283 


23 


0.962 


0.679 


1285


30 


0.888 


0.573 


1286


15 


0.996 


0.988 


1287 


27 


0.992 


0.893 


1288 


24 


0.952 


0.685 


1289 


36 


0.953 


0.605 


1290 


32 


0.932 


0.649 


1291 


24 


0.990 


0.935 


1292 


24 


0.973 


0.940 


1293 


20 


0.965 


0.811 


1294 


18 


0.977 


0.957 


1296 


24 


0.987 


0.903 


1297


12 


0.894 


0.780 


1298


29 


0.899 


0.623 


1299 


19 


0.882 


0.753 


1300 


33 


0.996 


0.905 


1301 


21 


0.952 


0.663 


1302 


19 


0.984 


0.937 


1303 


32 


0.978 


0.885 


1305 


18 


0.985 


0.736 


1306 


46 


0.991 


0.888 


1308 


27 


0.996 


0.933 


1309 


24 


0.970 


o;9B 


1310 


27 


0.930 


0.778 


1312 


16 


0.990


0.959 


1313


18 


0.949 


0.767 


1314 


18 


0.896 


0.752 


1315 


18 


0.984 


0.888 


1316 


21 


0.953 


0.721 


1317 


35 


0.923 


0.688 


1318 


27 


0.940 


0.796 


1319 


26 


0.990 


0.837 


1320 


24 


0.972 


0.663 


1321 


18 


0.969 


0.722 


1323 


21 


0.955 


0.709 


1324 


21 


0.979 


0.935 


1325 


26 


0.944 


0.675 


1326 


29 


0.931 


0.569 


1327 


18 


0.997 


0.955 


1329 


24 


0.985 


0.845 


1330 


43 


0.901 


0.602 


1331 


32 


0.965 


0.699 


1332 


15 


0.881 


0.608 


1334 


32 


0.896 


0.556 


1335 


18 


0.963 


0.807 


1336 


19 


0.909 


0.593 


1337 


16 


0.885 


0.562 


1338 


18 


0.911 


0.688 


1339 


24 


0.980 


0.847 


1340 


25 


0.943 


0.774 


1341 


20 


0.973 


0.778 


1342 


27 


0.924 


0.686 


1343 


24 


0.914 


0.585 


1344 


16 


0.957 


0.773 


1345 


15 


0.906 


0.798 


1346 


16 


0.971 


0.855 


1347 


24 


0.980 


0.901 


1348 


23 


0.965 


0.642 


1349 


22 


0.899 


0.609 


1350 


18 


0.940 


0.585 


1351 


19 


0.985 


0.935 


1352 


22 


0.945 


0.718 


1353 


20 


0.943 


0.728 


1354 


15 


0.887 


0.721 


1355 


16 


0.915 


0.737 


1358 


21 


0.948 


0.585 


1360 


30 


0.911 


0.555 


1361 


20 


0.976 


0.851 


1362 


19 


0.927 


0.791 


1364 


19 


0.947 


0.574 


1365 


28 


0.997 


0.786


1366 


28 


0.979 


0.855 


1367 


22 


0.895 


0.577 


1368 


19 


0.956 


0.829 


1369 


16 


0.929 


0.739 


1370 


17 


0.931 


0.745 


1371 


30 


0.950 


0.708 


1372 


28 


0.968 


0.856 


1373 


26 


0.953 


0.711 


1375 


32 


0.983 


0.842 


1376 


19 


0.929 


0.689 


1377 


30 


0.899 


0.631 


1378 


25 


0.927 


0.775 


1379 


19 


0.982 


0.922 


1380 


28 


0.940 


0.628 


1381 


20 


0.890 


0.610 


1382 


28 


0.921 


0.606 


1383 


23


0.881 


0.644 


1384 


24 


0.978 


0.911 


1385 


21 


0.974 


0.723 


1386 


26 


0.980 


0.795 


1387 


16 


0.903 


0.654 


1388 


20 


0.912 


0.596 


1389


19 


0.981 


0.960 


1390 


25 


0.932 


0.790 


1391 


15 


0.990 


0.963 


1395 


18 


0.942 


0.709 


1396 


28 


0.963 


0.844 


1397 


19 


0.972 


0.882 


1398 


21 


0.966 


0.827 


1399 


21 


0.962 


0.752 


1400 


25 


0.979 


0.855 


1402 


23 


0.913 


0.685 


1403 


19 


0.935 


0.829 


1404


21 


0.984 


0.958 


1405 


27 


0.888 


0.566


1406 


36 


0.945 


0.564


1407 


19 


0.938 


0.755


1408 


22 


0.947 


0.745


1409 


16 


0.909 


0.728 


1410 


20 


0.961 


0.866 


1412 


22 


0.991 


0.926 


1413 


20 


0.911 


0.683 


1414


15 


0.905 


0.737 


1416 


13 


0.933 


0.799 


1417 


46 


0.956 


0.728


1418 


20


0.945 


0.782


1419 


19 


0.987 


0.953


1420 


30 


0.976 


0.862 


1421 


24 


0.964 


0.796 


1423 


23 


0.924 


0.645 


1425 


19 


0.913 


0.670 


1426 


33 


0.968 


0.774 


1427 


22 


0.941 


0.632 


1428 


18 


0.972 


0.935 


1429 


15 


0.978 


0.909 


1430 


26 


0.926 


0.713 


1431 


26 


0.915 


0.659 


1432 


21 


0.949 


0.790 


1433 


27 


0.996 


0.854 


1434 


26 


0.910 


0.590 


1436 


21 


0.983 


0.793 


1437 


18 


0.932 


0.643 


1438 


21 


0.908 


0.583 


1439 


24 


0.925 


0.742 


1440 


18 


0.909 


0.736 


1441 


30 


0.883 


0.615 


1442 


37 


0.960 


0.714 


1444 


30 


0.942 


0.586


1445 


24 


0.904 


0.640


1446 


26 


0.950 


0.724 


1447 


15 


0.956 


0.757


1448 


30 


0.906 


0.692 


1449 


21 


0.933 


0.751 


1450 


25 


0.990 


0.855 


1451 


20 


0.893 


0.775 


1452 


26 


0.952 


0.729 


1453 


44 


0.990 


0.654 


1454 


20 


0.974 


0.810 


1455 


21 


0.960 


0.679 


1456 


17 


0.926 


0.629 


1457 


23 


0.982 


0.940 


1458 


18 


0.986 


0.938 


1459 


22 


0.940 


0.617 


1460 


18 


0.939 


0.698 


1461 


39 


0.997 


0.955 


1462 


11 


0.989 


0.626 


1463 


16 


0.972 


0.911 


1465 


17 


0.948 


0.855 


1466 


13 


0.901 


0.739 


1467 


20 


0.960 


0.883 


1468 


26 


0.903


0.585 


1469 


18 


0.914 


0.710 


1470 


23 


0.972 


0.908 


1471 


19 


0.942 


0.626 


1473 


25 


0.972 


0.670 


1474 


15 


0.917 


0.810 


1475 


40 


0.923 


0.825 


1477 


21 


0.914 


0.589 


1478 


26 


0.964 


0.721 


1479 


19 


0.936 


0.624 


1481 


22 


0.995 


0.943 


1482 


20 


0.995 


0.959 


1484 


19 


0.964 


0.755 


1485 


15 


0.956 


0.847 


1486 


27 


0.963 


0.584 


1487 


23 


0.941 


0.781 


1488 


32


0.969 


0.816 


1489 


29 


0.956 


0.742 


1491 


20 


0.894 


0.615 


1492 


34 


0.923 


0.668 


1493 


16 


0.943 


0.809 


1494 


19 


0.969 


0.878 


1495 


27 


0.944 


0.726 


1496 


45 


0.915 


0.688 


1497 


45 


0.908 


0.583 


1499 


45 


0.987 


0.820 


1500 


20 


0.972 


0.790 


1501 


14 


0.881 


0.637 


1503 


24 


0.973 


0.786 


1504 


16 


0.923 


0.752 


1505 


22 


0.965 


0.829 


1507 


43 


0.996 


0.907 


1509 


21 


0.948 


0.732 


1510 


23 


0.962 


0.822 


1511 


34 


0.921 


0.646 


1512 


19 


0.959 


0.753 


1513 


46 


0.962 


0.628 


1514 


21 


0.928 


0.717 


1515 


16 


0.926 


0.731 


1516 


15 


0.885 


0.663 


1517 


21 


0.935 


0.795 


1518 


21 


0.945 


0.852 


1519 


13 


0.881 


0.636 


1520 


20 


0.949 


0.704 


1521 


21 


0.938 


0.745 


1522 


20 


0.977 


0.923 


1523 


23 


0.925 


0.619 


1524 


20 


0.933 


0.728 


1525 


11 


0.912 


0.784 


1526 


29 


0.907 


0.656 


1527 


18 


0.962 


0.704 


1528 


42 


0.977 


0.817 


1529 


37 


0.960 


0.623 


1530 


22 


0.899 


0.649 


1532 


22 


0.943 


0.663 


1533 


20 


0.970 


0.936 


1534 


28 


0.934 


0.607 


1535 


30 


0.989 


0.890 


1536 


16 


0.984 


0.932 


1537 


22 


0.992 


0.974 


1538 


35 


0.976 


0.622 


1539 


20 


0.901 


0.576 


1540 


28 


0.944 


0.697 


1542 


28 


0.936 


0.667 


1543 


25 


0.891 


0.550 


1544 


21 


0.967 


0.700 


1545 


31 


0.938 


0.649 


1546 


21 


0.883 


0.569 


1547 


29 


0.953 


0.614 


1548 


12 


0.916 


0.815 


1549 


23 


0.955 


0.658 


1550 


21 


0.948 


0.635 


1551 


19 


0.956 


0.835


1552 


18 


0.960 


0.803


1554 


33 


0.920 


0.577 


1555 


24 


0.947 


0.717 


1556 


31 


0.898 


0.658 


1557 


24 


0.960 


0.876 


1558 


23 


0.985 


0.878 


1560 


38 


0.919 


0.553 


1561 


12 


0.942 


0.841 


1562 


21 


0.887 


0.568 


1563 


19 


0.990 


0.928 


1564 


18 


0.950 


0.814 


1567 


26 


0.970 


0.822 


1569 


14


0.928 


0.806 


1570 


26 


0.998 


0.969 


1571 


18 


0.911 


0.762 


1572 


28 


0.986 


0.924 


1574 


15 


0.935 


0.815 


1575 


18 


0.955 


0.896 


1576 


26 


0.949 


0.697 


1577 


20 


0.945 


0.856 


1578 


24 


0.962 


0.723 


1579 


23 


0.976 


0.716 


1580 


20 


0.903 


0.597 


1582 


19 


0.880 


0.679 


1583


25 


0.984 


0.918 


1584 


22 


0.991 


0.876 


1585 


23 


0.968 


0.710 


1586 


33 


0.894 


0.596 


1587 


23 


0.918 


0.721 


1588 


19 


0.913 


0.703


1589 




0.951 


0.886 


1590 


28 


0.887 


0.557 


1591 


26 


0.999 


0.969 


1592 


19 


0.968 


0.865 


1593 


32 


0.962 


0.612 


1594 


22 


0.966 


0.864 


1596 


19 


0.970 


0.823 


1597 


15 


0.917 


0.825 


1598 


32 


0.991 


0.900 


1599 


26 


0.927 


0.693 


1600 


18 


0.896 


0.656 


1601 


16 


0.926 


0.833 


1602 


18 


0.948 


0.883 


1603 


18 


0.977 


0.868 


1604 


34


0.943 


0.730 


1606 


15 


0.930 


0.640 


1607 


32 


0.967 


0.697 


1608 


21 


0.922 


0.658 


1610 


30 


0.881 


0.586 


1611 


30 


0.887 


0.667 


1612 


19 


0.938 


0.565 


1613 


22 


0.977 


0.894 


1614 


20 


0.925 


0.725 


1615 


25 


0.972 


0.746 


1616 


30 


0.986 


0.671 


1619 


18 


0.917 


0.620 


1620 


28 


0.968 


0.611 


1621 


29 


0.925 


0.613 


1622 


48 


0.968 


0.711 


1623 


24 


0.937 


0.586 


1624 


19 


0.914 


0.694 


1625 


26 


0.906 


0.685 


1626 


14 


0.962 


0.863 


1627 


28 


0.976 


0.911 


1629 


17 


0.973 


0.938 


1630 


22 


0.962 


0.919 


1632 


31 


0.997 


0.846 


1633 


25 


0.920 


0.607 


1634 


17 


0.982 


0.945 


1635


17 


0.994 


0.968 


1638 


30 


0.922 


0.705 


1639 


21 


0.952 


0.714 


1640 


21 


0.966 


0.807 


1641 


23 


0.983 


0.821 


1642 


18 


0.953 


0.885 


1643 


16 


0.907 


0.647 


1644 


20 


0.884 


0.650 


1645 


17 


0.959 


0.680


1646 


18 


0.991 


0.954 


1647 


30 


0.983 


0.786 


1648 


21 


0.886 


0.567 


1649 


24 


0.894 


0.658 


1650 


23 


0.881 


0.657 


1651 


27 


0.932 


0.702 


1652 


22 


0.993 


0.885 


1653 


17 


0.990 


0.926 


1654 


19 


0.932 


0.622 


1655 


34 


0.931 


0.673 


1656 


19 


0.966 


0.909 


1657 


17 


0.955 


0.867 


1658 


38 


0.954 


0.594 


1659 


19 


0.920 


0.710


1660 


37 


0.988 


0.598 


1662 


32 


0.909 


0.675 


1664 


16 


0.937 


0.804 


1665 


20 


0.911 


0.621 


1667 


29 


0.981 


0.871 


1668 


33 


0.972 


0.869 


1669 


22 


0.968 


0.913 


1670 


23 


0.990 


0.932


1672 


22 


0.939 


0.716 


1673 


17 


0.963 


0.865


1674


38 


0.949 


0.669 


1675 


20 


0.926 


0.787 


1677 


19 


0.938 


0.785


1678 


20 


0.929 


0.727 


1679 


20 


0.916 


0.604 


1680 


21 


0.967 


0.886 


1681 


20 


0.909 


0.749 


1682 


30 


0.928 


0.776 


1683 


20 


0.916 


0.649 


1684 


21 


0.976 


0.879 


1685 


13 


0.897 


0.645 


1686 


13 


0.994 


0.963 


1687 


17 


0.898 


0.743 


1688 


30 


0.946 


0.638 


1689 


21 


0.996 


0.976 


1690 


18 


0.916 


0.595 


1691 


17 


0.934 


0.754 


1692 


28 


0.899 


0.753 


1693 


20 


0.933 


0.655 


1694 


19 


0.990 


0.920 


1695 


17 


0.945 


0.731 


1697 


18 


0.885 


0.588 


1698 


29 


0.986 


0.937 


1699 


26 


0.972 


0.557 


1700 


17 


0.977 


0.946 


1701 


17 


0.882 


0.608 


1702


20 


0.989 


0.952 


1703 


22 


0.919 


0.578 


1706 


31 


0.895 


0.648 


1707 


22 


0.965 


0.922 


1708 


22 


0.937 


0.569 


1709 


20 


0.980 


0.903 


1710 


17 


0.972 


0.857 


1711 


27 


0.984 


0.823 


1712 


17 


0.963 


0.872 


1713 


24 


0.977 


0.880 


1714 


17 


0.970 


0.908 


1715 


31 


n oil 


0.843 


1716 


18 


0.931 


0.703


1717 


18 


0.931 


0.702 


1718 


34 


0.946 


0.628 


1719 


19 


0.973 


0.883 


1720 


48 


0.980 


0.845 


1721 


28 


0.922 


0.676 


1722 


44 


0.965 


0.645 


1723 


26 


0.887 


0.730 


1724 


25 


0.939 


0.795 


1725 


15 


0.971 


0.942


1727 


23 


0.923 


0.591 


1728 


23 


0.987 


0.936 


1729 


18 


0.927 


0.814 


1730


18 


0.935 


0.605


1731 


25 


0.972 


0.912 


1732


42 


0.972 


0.726 


1733 


20 


0.952 


0.798 


1734 


17 


0.975 


0.918 


1735 


15 


0.979 


0.877


1736 


41 


0.933 


0.659 


1738 


17 


0.925 


0.746 


1739 


18 


0.912 


0.764 


1741 


11 


0.953 


0.814 


1742 


23 


0.976 


0.774 


1744 


23 


0.918 


0.606 


1746 


29 


0.915 


0.652 


1747 


15 


0.933 


0.840 


1748 


27 


0.903 


0.612 


1750 


29 


0.904 


0.618 


1751 


22 


0.888 


0.670 


1752 


16 


0.979 


0.868 


1753 


26 


0.959 


0.884


1754 


22 


0.954 


0.696 


1755 


20 


0.895 


0.707 


1756 


26 


0.906 


0.703 


1757 


14 


0.888 


0.587 


1758 


15 


0.994 


0.953 


1759 


21 


0.922 


0.610 


1760 


21 


0.942 


0.693 


1761 


19 


0.947 


0.814 


1762 


21 


0.934 


0.655


1763 


22 


0.940 


0.609 


1764 


23 


0.937 


0.832


1765 


23 


0.896 


0.677 


1766 


26 


0.909 


0.690 


1768 


18 


0.915 


0.689 


1769 


36 


0.969 


0.602 


1770 


20 


0.880 


0.640 


1772 


20 


0.942 


0.715 


1773 


20 


0.947 


0.817 


1774 


16 


0.969 


0.880 


1775 


18 


0.971 


0.859 


1776 


24 


0.891 


0.670 


1777


27 


0.961 


0.747


1778 


40 


0.963 


0.574


1779 


23


0.974 


0.656 


1780 


21 


0.899 


0.653 


1781


25 


0.908 


0.601 


1782


19 


0.943 


0.678 


1783 


23 


0.936 


0.634 


1784 


29 


0.949 


0.786 


1785 


44 


0.915 


0.571 


1786 


22 


0.965 


0.885 


1787 


15 


0.974 


0.940 


1789 


23 


0.952 


0.659 


1790 


16 


0.972 


0.898 


1791 


21 


0.980 


0.953 


1792 


32 


0.961 


0.668 


1793 


29 


0.907 


0.551 


1794 


22 


0.957 


0.934 


1795 


21 


0.990 


0.849 


1796 


22 


0.954 


0.893 


1797 


16 


0.942 


0.657 


1799 


25 


0.949 


0.840 


1800 


28 


0.949 


0.739


1801 


25 


0.938 


0.767 


1802 


15 


0.899 


0.672 


1803 


17 


0.987 


0.956 


1804 


24 


0.941 


0.775 


1805 


26 


0.972 


0.771 


1806 


20 


0.985 


0.957 


1807 


22 


0.932 


0.571


1808 


16 


0.927 


0.608 


1809


26 


0.987 


0.770 


1810


37 


0.955 


0.592 


1811 


28 


0.911 


0.632 


1812 


24 


0.894 


0.698


1813 


22 


0.906 


0.624 


1814 


34 


0.951 


0.806 


1816 


25 


0.919 


0.578 


1817 


26 


0.980 


0.932 


1818 


19 


0.993 


0.940 


1820 


26 


0.939 


0.810 


1821 


48 


0.967 


0.556 


1822 


19 


0.931 


0.753 


1823 


36 


0.892 


0.670 


1824 


18 


0.903 


0.674 


1825 


17 


0.966 


0.854 


1826 


15 


0.938 


0.849 


1827 


27 


0.985 


0.891 


1828 


17 


0.895 


0.665 


1829 


36 


0.916 


0.620 


1830 


22 


0.952 


0.835 


1831 


17 


0.961 


0.731 


1832 


19 


0.996 


0.982 


1833 


19 


0.918 


0.556 


1834 


37 


0.926 


0.587 


1836 


14 


0.897 


0.787 


1837 


19 


0.960 


0.816 


1838 


31 


0.902 


0.632 


1839 


17 


0.987 


0.955


1840


23


0.988 


0.941 


1842 


26


0.915 


0.695 


1843 


26 


0.987 


0.926 


1904 


25 


0.949 


0.844 


1905 


23 


0.945 


0.718 


1906 


18 


0.907 


0.556 


1907 


20 


0.961 


0.786 


1908 


19 


0.907 


0.752 


1909 


17 


0.957 


0.808 


1910 


22 


0.933 


0.778 


1911 


22 


0.988 


0.913 


1912 


32 


0.964 


0.814 


1913 


21 


0.952 


0.784 


1914 


24 


0.946 


0.644 


1915 


21 


0.919 


0.644 


1916 


21 


0.969 


0.912 


1917 


16 


0.962 


0.681 


1918 


14 


0.926 


0.776 


1919 


23 


0.987 


0.897 


1920 


48 


0.987 


0.614 


1921 


23 


0.899 


0.677 


1922 


23 


0.907 


0.651 


1923 


16 


0.921 


0.706 


1924 


20 


0.928 


0.672 


1925 


26 


0.985 


0.942 


1926 


27 


0.911 


0.682 


1927 


19 


0.939 


0.700 


1928 


15 


0.887 


0.709 


1929 


15 


0.980 


0.959 


1930 


25 


0.987 


0.924 


1931 


28 


0.936 


0.745 


1932 


20 


0.958 


0.669 


1933 


21 


0.988 


0.945 


1934 


24 


0.912 


0.699 


1935 


23 


0.909 


0.726 


1936 


20 


0.964 


0.924 


1937 


28 


0.960 


0.813 


1938 


18 


0.971 


0.806 


1939 


20 


0.954 


0.746 


1941 


20 


0.986 


0.933 


1942 


45 


0.976 


0.736 


1944 


18 


0.967 


0.871 


1945 


20 


0.973 


0.759 


1947 


17 


0.954 


0.919 


1948 


21 


0.970 


0.871 


1949 


18 


0.991 


0.976 


1950 


27 


0.893 


0.647 


1951 


19 


0.881 


0.705 


1952 


24 


0.977 


0.830 


1953 


15 


0.957 


0.834 


1954 


29 


0.970 


0.863 


1956 


19 


0.940 


0.835 


1957 


32 


0.992 


0.891 


1958 


22 


0.968 


0.837 


1959 


27 


0.908 


0.725 


1960 


20 


0.941 


0.751 


1961 


21 


0.885 


0.669 


1962 


29 


0.955 


0.797 


1963 


16 


0.974 


0.950 


1964 


21 


0.929


0.745 


1965 


24 


0.913 


0.658 


1966 


45 


0.937


0.671 


1968 


43 


0.956 


0.581 


1969 


19 


0.956 


0.614 


1970 


46 


0.901 


0.566 


1971 


24 


0.947 


0.768 


1972 


24 


0.900 


0.642 


1974 


22 


0.988 


0.922 


1975 


24 


0.951 


0.710 


1976 


18 


0.932 


0.740 


1977 


18 


0.954 


0.736 


1978 


20 


0.994 


0.967 


1979 


26 


0.987 


0.926 


1980 


22 


0.964 


0.866 


1981 


13 


0.932 


0.870 


1982 


21 


0.949 


0.881 


1983 


23 


0.957 


0.658 


1984 


12 


0.954 


0.910 


1985 


22 


0.990 


0.829 


1986 


31 


0.987 


0.845 


1987 


20 


0.919 


0.721 


1988 


17 


0.985 


0.966 


1989 


24 


0.966 


0.830 


1990 


31 


0.971 


0.816 


1991 


15 


0.935 


0.823 


1992 


21 


0.967 


0.802 


1994 


18 


0.930 


0.650 


1995 


20 


0.902 


0.611 


1996 


23 


0.946 


0.724 


1997 


25 


0.943 


0.787 


1998 


18 


0.921 


0.666 


1999 


13 


0.883 


0.748 


2000 


24 


0.899 


0.579 


2001 


13 


0.918 


0.705 


2002 


18 


0.899 


0.809 


2003 


18 


0.950 


0.647 


2004 


30 


0.981 


0.889 


2005 


17 


0.950 


0.771 


2007 


24 


0.940 


0.800 


2008 


21 


0.980 


0.815 


2009 


43 


0.939 


0.655 


2010 


16 


0.920 


0.698 


2011 


30 


0.978 


0.901 


2012 


19 


0.981 


0.919 


2013 


40 


0.978 


0.553 


2014 


20 


0.994 


0.960 


2015 


18 


0.955 


0.771 


2016 


25 


0.914 


0.769 


2017 


31 


0.952 


0.776 


2018 


26 


0.985 


0.854 


2019 


16 


0.945 


0.822 


2020 


22 


0.973 


0.804 


2021 


17 


0.954 


0.919 


2022 


19 


0.993 


0.973 


2023 


18 


0.921 


0.683 


2026 


23 


0.890 


0.604 


2027 


35 


0.943 


0.603 


2028 


25 


0.992 


0.953 


2029 


47 


0.950 


0.846 


2030 


17 


0.914 


0.722 


2032 


18 


0.995 


0.974 


2033 


17 


0.933 


0.828 


2034 


17 


0.934 


0.644 


2035 


26 


0.910 


0.567 


2036 


30 


0.940 


0.690 


2037 


23 


0.908 


0.557 


2038 


18 


0.906 


0.624 


2039 


18 


0.926 


0.768 


2040 


14 


0.934 


0.758 


2041 


18 


0.960 


0.869 


2042 


21 


0.911 


0.716 


2043 


25 


0.896 


0.576 


2044 


27 


0.953 


0.850 


2045 


17 


0.962 


0.863 


2046 


25 


0.924 


0.572 


2047 


39 


0.955 


0.608 


2048 


38 


0.958 


0.692 


2049 


25 


0.949 


0.803 


2050 


27 


0.932 


0.726 


2051 


15 


0.900 


0.672 


2052 


22 


0.967 


0.703 


2053 


19 


0.960 


0.757 


2054 


20 


0.880 


0.775 


2055 


19 


0.913 


0.721 


2057 


23 


0.955 


0.882 


2058 


23 


0.893 


0.728 


2059 


26 


0.953 


0.619 


2060 


19 


0.935 


0.770 


2061 


44 


0.952 


0.739 


2062 


31 


0.964 


0.894 


2063 


19 


0.924 


0.707 


2064 


18 


0.891 


0.673 


2065 


25 


0.912 


0.764 


2067 


25 


0.954 


0.812 


2068 


20 


0.913 


0.685 


2069 


40 


0.974 


0.686 


2070 


28 


0.991 


0.896 


2072 


18 


0.956 


0.844 


2073 


26 


0.928 


0.741 


2074 


17 


0.902 


0.678 


2075 


18 


0.965 


0.850 


2076 


11 


0.975 


0.937 


2077 


32 


0.988 


0.863 


2078 


29 


0.922 


0.662 


2080 


20 


0.986 


0.918 


2081 


13 


0.969 


0.953 
SEQ ID NO: 



1 



Chromosomal location 



_2_ 
3 



15 



12 



15 



10 



13 



11 



12 



13 



10 



14 



15 



16 



10 



12q 



17 



19 



20 



21 



23 



24 



12 



11 



25 



26 



27 



16 



28 



29 



30 



31 



11 



32 



33 



34 



17 



36 



38 



39 



X 



16 



41 



42 



19 



43 



44 



45 



46 



19 



18 



47 



48 



49 



10 



52 



53 



11 



18 



54 



17 



55 



17 



56 



57 
59 



60 



21 



10 



61 



18 



63 



64 



65 



11 



20q11.21-11.23 
Table 7 



SEQ ID NO: 


Chromosomal location 


66 


15 


68 


11 


70 


14 


71 


9 


72 


11 


75 


1 


77 


2 


78 


3 


79 


7 


80 


3 


81 


1 


82 


13 


83 


6p11.2-12.3 


84 


1 


85 


4 


86 


5 


87 


12 


88 


6 


90 


2 


92 


6 


95 


15 


96 


10 


97 


4 


98 


14q31 


99 


1 


100 


5 


101 


2 


102 


4 


103 


4 


104 


19 


105 


11 


107 


3 


109 


10 


111 


X 


114 


X 


115 


2 


116 


1 


117 


5 


118 


9 


120 


2 


121 


19 


123 


2 


124 


10 


125 


5 


126 


X 


128 


1 


130 


3 


131 


17 


135 


9 


136 


16 


137 


17 


138 


2 


139 


2 


140 


6q16.1-16.3 


142 


9 


143 


20 


145 


8 


146 


22q13 


147 


1 


148 


6 


149 


16 


151 


6 


152 


6 


153 


2 


155 


4 


156 


17 


157 


17 


158 


11 


159 


11 


160 


16 


161 


1 


162 


17 


163 


1 


164 


5 


165 


15 


166 


3 


168 


9 


169 


6 


170 




171 


u 1 


172 


4 


174 


10 


175 


8 


176 


6 


177 


15 


178 


6 


179 


9 


180 


9 


181 


2 


182 


6 


183 


2 1 


185 


11 


186 


11 


188 


18 


189 


11 


190 


9 


191 


10 


192 


4 


193 


Xq13.2-21.1 


194 


10 


196 


20 


197 


10 


198 


6 


199 


11 


201 


11 


203 


X 


206 


8 



207 


11 


208 


19 


209 


15 


210 


3q 


211 


6q25.1-26 


212 


9 


214 


19 


215 


20 


217 


1 


218 


22q13.31-13.33 


219 


1 


220 


2 


221 


3 


222 


9 


223 


15 


225 


3p 


226 


18 


228 


4 


229 


17 


230 


17 


231 


1 1 


232 


19 


234 


11 


235 


19 


238 


3 


239 


6 


241 


11 


242 


10 


243 


15 


244 


4 


245 


21 


246 


19 


248 


6p12.3-21.2 


249 


3 


250 


1 


251 


20 


252 


16q24.3 


253 


19 


254 


14 


255 


9 


257 


2 


258 


11 


259 


17 


260 


19 


261 


8 


262 


3 


263 


8 


264 


16 


265 


9q34.2-34.3 


266 


10 


267 


17 


268 


4 


269 


3p 


270 


9q13-21.33 


271 


1 


272 


8 


273 


19 


275 


17 


279 


3q 


280 


15 


281 


6 


282 


17 


2X1 


17 


285 


15 


286 


5 


289 


10 


290 


9 


297 


7 


291 


8 


994 


18 


996 


4 




15 


9QR 


15 


900 


10 1 


juU 


7 


101 


5 


10? 


11 


104 

JUT 




105 


Xa25-26 2 


106 


18 


107 

-7\/ / 


2 


308 


17 


109 


1 


310 


12 


11 1 


20 


111 


18 


114 


11 


11 5 


14 


116 


6 


117 


10 


118 

J AO 


10 


119 


19 


190 


Q 


191 




199 


10 


191 




194 


10 


195 


1 

1 


126 


16 


127 
«7^» / 


6 


128 


x 


330 


4 


331 


2 


112 


14 


313 
j j j 


2 


114 


2 


116 

.7.7 U 


21a22 3 

X *l|f ffr i i * 


117 


9 


118 


19 


119 


1S 


1 140 


4 


341 


9 


342 


10 


343 


19 


344 


5 


346 


16 


349 


3 


350 


11 


352 


17 


353 


18 


354 


20 


356 


3 


357 


5 


358 


11 


359 


9 


364 


2 


365 


4 


366 


7 


367 


5 


369 


8 


370 


4 


371 


6ql5-16.1 


372 


19 


374 


2 


375 


12 


376 


17 


377 


1 


379 


19 


380 


9 


381 


6 


382 


9 


383 




384 


18 


385 


3 


387 


1 


388 


21 


389 


17 


390 


17 


391 


4 


393 


10 


394 


11 


395 


11 


396 


10 


397 


16 


398 


13 


400 


3 


402 


2 


403 


Xq28 


406 


1 


407 


19 


408 


8 


409 


4 


410 


3 


411 


4 


412 


5 


413 


22q12.3-13.1 


414 


8 


416 


8 


417 


20p12.2-13 


418 


10 


420 


4 


421 


8 


423 


11 


424 


17 


425 


17 


426 


17 


427 


17 


428 


4 


429 


2 


430 


3 


431 


19 


432 


18 


433 


12 


434 


17 


435 


6 


436 


2 


438 


1 


439 


o 


441 


1 


442 


2 


443 


11 


444 


2 


446 


11 


447 


19 


448 


11 


449 


19 


450 


3 


452 


3 


453 


5 


455 


17 


457 


6 


459 


18 


460 


18 


461 


14 


462 


5 


463 


11 


464 


3 


465 


7 


466 


11 


467 


13 


470 


19 


471 


6d24 1-25 3 


473 


4 


474 


15 


475 


13 


478 


8 


479 


10 


480 


15 


481 


9 


482 


ln23 1.24 1 


483 


fi i 
o 


484 


17 


486 


15 


487 


22q11 


488 


3q 


489 


1 


490 


3 


492 


11 


493 


1p36.2-36.3 


4iO 


10 


4yo 


10 


49/ 


1 R 


AOS2 




499 


c 


5U1 


O 


5U3 


i 
1 


5U4 


1 ft 
1U 


505 


ZU 


506 


3 


CAT 

507 


1 q 
18 


C AO 

508 


o 
8 


CAA 

509 


1 


510 


2 


513 


oq25.2-20 


C 1 A 

514 


o 


517 


■J 
3 


CIO 

518 


5 


519 


1 o 
12 


<nn 

520 


13 


CO 1 

521 


12 


D22 


1 c 
ID 


5/3 


i c 
ID 


CO/I 

524 


Q 
O 


D2D 


ID 


DZO 


1 c 
ID 


52o 


A 
4 


53U 


O 

o 


c^ i 
5il 


1 1 


532 


A 

4 


533 


1 *7 
1 / 


534 


3 


f 1 c 

535 


1 D 

18 


53o 


1 o 

18 


537 


1 c 

15 


CIO 

538 


13 


539 


Q 
O 


</l A 
54U 


A 


542 


2 


543 


5 


J 44 


AQ2D. 


C/l£ 
D4o 


i i 
1 1 


54/ 


22Ql5.2-l3.33. 


549 


13Q12-13 


CCA 

55U 


i 
1 


ceo 
552 


oq23 


e ei 
553 


19 


CC/1 

554 


1 
1 


ccc 
DD5 


1 7 


D5o 


n 
1 


J JO 




559 


8 


560 


12 


561 


10 


563 


19 


564 


10 



WO 03/080795 PCT/US02/25485 

434 





565 


17 


566 


9 


567 


1 


568 


Xq22.2-24 


569 


1 3 


570 


1 


571 


5 


573 


6q22.1-22.33 


574 


15 


575 


17 


576 


5 


577 


5 


578 


11 


581 


22q12 


582 


16 


584 


6q25.3-26 


585 


3 


586 


11 


587 


2 1 


588 


2 


589 


15 


590 


11 


591 


11 


593 


Xp11.3-21.1 


594 


22 


595 


9 


596 


11 


597 


10 


598 


11 


599 


12 


601 


9 


602 


16 


603 


12 


604 


8 


605 


6 


606 


11 


607 


10 


608 


1 1 


609 


3 


610 


5 


611 


3 


612 


6 


613 


10 


614 


17 


615 


11 


616 


6 


617 


16 


618 


11 


620 


18 


621 


17 


622 


17 


624 


22 


625 


3 


626 


19 


627 


11 


629 


3 


610 




611 


17 
i / 


619 


£ 
u 


614 


7 


61 <5 


10 


616 


1? 


£17 

Oj / 


O 




Q 
O 


Oh\) 


J 


£A1 
041 


1 1 
1 1 


£/19 
OH A 


A 


OH J 


7 

/ 


(LA A 

OHH 


90r\19 1 11 
ZUp 1Z. 1-1 j. 


(LA(L 

oho 


1 c 
1 J 


(LAn 
OH 1 


z 


£48 
OHO 


16 

ID 


OHzf 


Q 
O 


£50 


4 


651 


1lal9 1 1-19 9 


£59 
OJZ 


10 


£54. 


1 


£55 

O^D 




£56 

OjO 


J 


£57 
OD 1 


11 


ojy 


i 


££ft 

OOU 


18 


££1 

ODl 


99 


££9 
ooz 




££i 

003 




OOH 


1 8 
10 


OOD 


4 


OOO 


4 

H 


££7 

00/ 


C 
J 


0/1 


1 1 
1 1 


£79 
O/Z 


18 
lo 


£74 
0 /*f 


1Q 


£7 5 

O/D | 


17 
1 / 


£7£ 

o to 


17 


fill 

Oil 


10 


£78 


10 


£7Q 


4 


£80 


Q 
o 


£81 

OO 1 


5 


£87 
ooz 


4 


£81 


o 


684 


1 

1 


£8£ 
OOO 


1 1 

1 1 


£87 
00/ 


<: 
J 


£8Q 
oo^ 


Q 


£00 
o^u 


A 
H 


691 


4 


692 


5 


693 


1 


694 


16 


695 


19 


696 


12 


697 


11 


698 


11 


699 


10 


702 


5 


704 


16 


705 


3 


707 


3 


708 


IOdII 21-12 1 


709 


11 


710 


10 


711 


10 


712 


10 


714 


3 


715 


6q25.3-26 


716 




718 


X 


719 


17 



721 




722 


16 


723 


2 


724 


12 


725 


16 


726 


19 


727 


3 


728 


16 


729 


6 


730 


16 


731 


7 


732 


11 


733 


g 


734 


9q21.11-21.2 


735 


17 


736 


5 


737 


1 


738 


1 


739 


1 


740 


Xq22.3-24 


741 


17 



743 


7 


744 


15 


746 


12 


747 


1 


748 


19 


749 


5 


750 


9 


751 


5 


752 


9 


753 


19 


754 


15 


755 


g 


756 


X 


757 


3 


758 


1p12-13.3 


760 


6 


761 


19 


762 


8 




761 


1? 


764 


9 
z 


765 


1 1 
1 1 


766 


1 1 
1 1 


767 


1 5 


768 
/ uo 


17 i 


769 


1 1 
1 1 


771 
/ 1 1 


1 1 
1 1 


772 


17 


ITS 


5 


lid 


18 
xo 


775 


1 
x 


111 


Q 
O 


778 
/ / o 


16 


781 

/ O 1 


16 

AO 


78? 


1 
1 


781 


91 

Z 1 


784 


6n?1 9-9? 1 


785 




787 


16 


788 


7 


78Q 


1 K 
I J 


700 


99 
zz 


701 


£ 


70? 
/ 2/Z 


1 
X 


701 


99 
zz 


704 


0 
0 


705 


9 
z 


706 


1 

1 


700 




800 

Ovv 


Q 


802 


Q 


801 


17 


804 


10 

X v 


805 


.3 


806 


9 
z 


807 


14 


810 


6 


811 


10 

XV/ 


812 


16 


813 


1 


815 

Q X «S 


16 


817 


1 


818 

OlO 


X J 


810 


v n 99 194 


891 

OZ 1 


1 

X 


899 
ozz 


6n 1 6 1-91 
04 x 0. x "Z x . 


ozo 


17 
X / 


89 5 
OZ J 


10 


896 


1 5 

X J 


827 


3 


828 


17 


829 


22q13.33 


830 


11 


832 


15 


833 


9q31.3-33.2 


834 


1 K 
1 J 


835 


V 

-A. 


836 


1 1 


837 


19 




10 


839 


9 
z 


840 


1 
1 


i 841 


o 
o 


84? 


/! 
*t 


841 


1 
1 


84 S 


1 (L 

10 


848 


10 

iy 


840 




8S1 

OJ 1 


9 
z 


8^1 

OJJ 


i n 

1U 


8S6 


9 

z 


8S7 


i 
i 


8S8 

O JO 


c 

J 


8S9 


9 
z 


860 


10 


861 


J 


862 


9 
z 


863 


1 1 
1 1 


864 




865 




866 

OOU 


91 

Zl 


867 


1 n49 1 1 -49 1 
1 q*rZ. 1 1 -HZ.. 3 


868 
ouo 


1 
1 


870 


0 

o 


871 

O / 1 


0 


87? 
o / z 


i 
X 


871 


1 9 
1Z 


874 

O /*T 


OqZ / 


876 


1 1 
1 1 


877 
o / / 


9 
Z 


' 878 


1 Q 


880 




881 


1 ! 


88*5 


0 

o 


886 


0 
y 


887 




888 


Q 

y 


891 


16 


892 


10 


893 

0-7 J 


91 
Zl 


894 


•> 


895 


D 


896 


A 


897 


1 7 


898 

070 


1 a 

10 


899 


10 


900 


16 


901 


3 


902 


11 


903 


1 


904 


13 


905 


19 


907 


10 


908 


5 


909 


1 


911 


1 


912 


5 


913 


16 


914 


1 


915 


8 


916 


11 


917 


17 


918 


16 


919 


19 


920 


7 


922 


9 


924 


10 


925 


11 


926 


11 


928 


1 


929 


1 


930 


12q 


931 


18 


932 


15 


933 


15 


934 


15 


935 


1p35.2-36.13 


937 


11 


938 


1 


939 


15 


940 


X 


942 


11 


943 


1 


944 


9 


946 


5 


947 


4 


949 


12 


951 


4 


952 


10 


953 


11 


956 


6 


957 


19 


959 


16 


960 


6 


962 


16q24.3 


963 


9 


964 


6 


965 


Xq12 


966 


11 


967 


15 


969 


17 


970 


10 


972 


10 


973 


Xq12 


974 


1p36.11-36.33 


976 


2 


977 


20 


979 


2 


980 


8 


981 


19 


984 


6 


985 


5 


987 


18 


988 


3 


989 


11 


990 


3 


991 


2 


992 


17 


993 


10 


994 


12 


995 


1p34.1-36.11 


996 


14 


997 


20p12.2-13 


998 


2 


1000 


12 


1001 


1 


1002 


X 


1005 


17 


1006 


1p31.2-32.1 


1007 


15 


1008 


15 


1009 


2 


1010 


13 


1011 


6 


1012 


18 


1013 


1 


1015 


6 


1016 


5 


1017 


12 


1018 


5 


1019 


CITB-H1 2291F22 


1020 


4 


1021 


18 


1022 


1 


1023 


11 


1024 


1 


1025 


3 


1027 


19 


1028 


2 


1030 


3 


1031 


4 


1032 


1 


1033 


3p 


1034 


X 


1035 


1 


1036 


1 


1038 


13 


1041 


3 
Table 8 



SEQ 
ID 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location of 
first amino 
acid residue 
of peptide 
sequence 


Predicted 
ending 
nucleotide 
location of 
last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (X=Unknown, *=Stop 
codon, /=possible nucleotide 
deletion, \=possible nucleotide insertion) 


2535 


C 


328 


546 


MMRRPVHCATDKEGDLAPKHFQAAAGEA 
RTSTDRSGAQAQRSVTPCQWHSVQDSSTY 
SSWWWAAAAETL 


2536 


A 


163 


699 


PADAPSLAAFPGDPQYDPPYCYPGTQCWV 

rubuMLLI^QTLCLGEQVLLGAWLVwGPS 

RDPRPLPYLCHDEPYTFDINLSVNLKGPGN 

RLGEPIPISKAHEHIFGMVLMNDWSGNYW 

SSWVKMTGKELGTWGNFIKAEDWCRSK 

GAVMAI^RAVTPTRAINESTIGAAGVDNE 

VSSTG 


2537 


A 


1415 


3050 


NHKSPMALPYHIFLFTVLLPSFTLTAPPPCR 

CMTSSSPYQEFLWRMQRPGNIDAPSYRSLS 

KGTPTFTAHTHMPRNCYHSATLCMHANT 

HYWTGKMINPSCPGGLGVTVCRTYFTQTG 

MSDGGGVQDQAREKHVKEAISQLTRGHST 

PSPYKGLVI^KLHETLRTHTRLVSLFNTTL 

TGLHEVSAQNPTNCWICLPLNFRPYVSEPV 

PEQWNNFSTEINTTSVLVGPLVSNLEITHTS 

NLTCVKFSNTTYTTNSQCIRWVTPPTQIVC 

LPSGIFFVCGTSAYRCLNGSSESMCFLSFLV 

PPMTIYTEQDLYSYWS*SPRNKRVPELPFVI 

GAGVLGGLGTGIGGITTSTQFYHKLSQELN 

GDMEQVA\DS\LVTLQDQLNSLAAWLQN 

BOAT T\T T T 1 A CB /"""♦/""IV'T t t r | m?/ w *oxnn tvt/™vci 

KJKALJJLJL 1 AtsKCjCj I CLLLGEECCYYVNQS 

U1V 1 JlKVJ^IKJLJKl^ 

SQWMPWBLPFLGPLAAIILLLLFGPCIFNLL 

VNFVSSRIEAVXLQMEPKMQSKTKIYRRPL 

DRPASPRSDVNDKGTPPEEISAAQPLLRPN 

SAGSS 


2538 


B 


67 




Ail/KVr 1 YrxlMlr Itul loT j 


2539 


A 


393 


l 


GGIGRGGGAGGGVGAAGSASGGVGRRGA 
GGVIADSGAPGGGVEGGVGASGGWRE/GR 
GTSGGVGGSGGACGSV/GGSGGAGGGVG 
AtAjro 1 oLHj V (jKoKIj 1 KjvjLCjCjovjoACjGG V 
GACGGASGYVGIRGAGGG 


2540 


A 


2 


370 


ARDPLLEQVELPAVASVSASVIKSPSDPSH 

VQVPPPPT T T PA ATTTJQXTCTCAifXJCCTDOrDXTV' 
V o V rr rr L,L,L,ri\J\ 1 1 KoiN o 1 oMrlooir olbiNK 

PPQATVKPQILTHVIEGFVIQEGLEPFPVSRS 
SLLffiQPVKKRPLLDNQVINSVCVOPEL 


2541 


A 


50 


247 


MWSAHPLAVLSLKLTLFSLTSDWLSSKDM 
AISLAFKISQILCSVLSAPGKPJUSVLWNTSS 
LKRS* 


2542 


A 


130 


3995 


HPDDIHTrLLAAGFLGLRTVGVTKAWRSG 

WLRFPAAMFLYNLTLQRATGISFAIHGNFS 

GTKQQEIWSRGKIL\ELLRPDPNTGKVHTL 

LTVEVFGVIRSLMAFRLTGGTKDYTVVGSD 

SGPJVH^YQPSKNMFEKIHQETFGKXSGGR 

SIVPGQFLAVDPKGRAVMISAIEKQKLVYI 

LNRDAAARLTISSPLEAHKANTLVYHVVG 

VDVGFENPMFACLEMDYEEADNDPTGEA 

AANTQQTLTFYELDLGLNHWRKYSEPLE 








EHGNFLITVPGGSDGPSGVLICSENYITYKN 

FGDQPDIRCPIPRRRNDLDDPERGMIFVCSA 

THKTKSMFFFLAQTEQGDIFKITLETDEDM 

VTEIRLKYFDTVPVAAAMCVLKTGFLFVA 

SEFGNHYLYQIAHLGDDDEEPEFSSAMPLE 

EGDTFFFQPRPLKNLVLVDELDSLSPILFCQ 

IADLANEDTPQLYVACGRGPRSSLRVLRH 

GLEVSEMAVSELPGNPNAVWTVRRHIEDE 

FDAYIIVSFVNATLVLSIGETVEEVTDSGFL 

GTTPTLSCSLLGDDALVQVYPDGIRHIRAD 

KRVNEWKTPGKKTIVKCAVNQRQWIALT 

GGELVYFEMDPSGQLNEYTERKEMSADV 

VCMSLANVPPGEQRSRFLAVGLVDNTVPJI 

SLDPSDCLQPLSM\QA\LPAQPES\LCIVEMG 

\GT*KQDELGERGSIGFLYLNIGLQNGVLLR 

TVLDPVTGDLSDTRTR\YLGSRPVKLFRVR 

MQGQEAVLAMSSRSWLSYSYQSPJAHLTP 

LSYETLEFASGFASEQCPEGIVAISTNTLRIL 

ALEKLGAVFNQVAFPLQ\YTPRK\FVIHPES 

NNLEIETDHNAYTEATK\A\QRKQQMAEE 

MVEAAWEDERDL\AAEMAAAF\LNENLPE 

SIFGAPKAGNGQLASVRRVMNPIQGEHTW 

TLSSLEQN\RAAF\SVAVCRFSNTGDDWYV 

LVGVPKDLILNPRSVAGGFVYTYKLVNNG 

EKLEFLHKTPVEEVPAAIAPFQGRVLIGVG 

KLLR\VY\DLGKEGSYFRKC*ELRHIANYI\S 

GDPDYSGHRVIVSDVQEKFHPGFRYKRKJL 

KIXLEFADDTVYPYRWYHYRPASWDYDTV 

GWGQDKFRPTYVWVPJLPTLTPIDEVR/DE 

DPTGNKSPVGTRGLAQMGGLPRKAEVIIEL 

THVG\ET\VLSLQKTT\LIPGRLQNSLVLLPP 

CFGGIG\ILVPRTSHE\DH\DFFQH\VE\MHLR 

\SEHPP\LCGGGDHL\SFRS\YYFPCEGM*LM 

ODLC£\QFNSM\EPNKQKERLLKELGPEPPP 

RSVPRKFEGYSGTRYGF 


2543 


A 


68 


425 


SHILPGAPGAPAWWTRWPSTLPEPFPRGRG 
SPAGTSPISRPGLVQSS*ASRGSDSRLPV/GP 
ASCQASGPGPDSRRPPPCTPAVGPHHGSLPS 
AGRVGASAAAAGPPSPAVPLPPAERPAP 


2544 


A 


1 


1982 


DAERQEALGIVRRIGTDTEAATEPAGATVP 
AAAAAARIGTVGPQPPAMPRRKRNAGSSS 
DGTEDSDFSTDLEHTDSSESDGTSRRSARV 
TRSSARLSQSSQDSSPVRNLQSFGTEEP\AY 
STRRVTRSQQQPTPVTPKKYPLRQTRSSGS 
ETEQWDFSDRETKNTADHDESPPRTPTGN 

ArooJioiJli/loox'IN V olllJliolAJsX'Mol^JSJJoVj 

SDLSHXRPKRRPJFHESYNFNMKCPTPGCNS 

LGHLTGKHERHFSISGCPLYHNLS\ADECK 

VRAQ\TRDKQIEERMLS\HRQDDNNRH\AT 

RHQAPTERQLRYKEKVAELKKKRNSGLSK 

EQKEKYMEHRQTYGNTREPLLENLTSEYD 










LDLFRRAQARASEDLEKLRLQGQITEGSN 

MDCTIAFGRYELDTWYHSPYPEEYARLGRL 

YMCEFCLKYMKSQTILRRHMAKCVWKHP 

PGDEIYRKGSISVFEVDGKKNKTYCQNLCL 

LAKLFLDHKTLYYDVEPFLFYVMTEADNT 

uCJhLLICiYFSKEKNSFLNYNVSCILTlVff 

RQGYGKMLIDFSYLLSKVEEKVGSPERPLS 

DLGLISYRSYWKEVLLRYLHNFQGKEISIK 

EISQETAVNPVDIVSTLQALQMLKYWKGK 

HLVLKRQDLEDEWIAKEAKRSNSNKTMDP 

SCLKWTPPKGT 


2545 


A 


95 


719 


VWPEVTDPEKFVYEDVAIAAYLLILWEEE 

HPGRGIDVRRRKIWDMYGPQTQLEEDAITP 
NDKTLFPDVDWLIGNHSDELTPW1PVIAAR 

QQVM^PTTPVT POi^TTTTT^TTTriP VCD "D nCFVTH 

oo I ANv^ivrr y JLJrL/OrrlJrlOK. i oinJvv^oJvivI. 

YREYLDFIKEVGFTCGFHVDEDCLRIPSTTR 

VCLVGKSRTYPYSffiASVDEKRTQYIKS 


2546 


B 


224 


429 


XPFLILLLSPVSTDQANTTTAEfflSQLTPRL 

xtt ttt ncAr a qt f\r\D\i r r\sm>'KTLiv\ri~*r\ r nsT} 
INiLllLbbyijAi>L.yt^KVl YHKNrlKY(jQltiP 

QKAEIWG 


2547 


A 


59 


335 


GLAAGLPETLfflSYCMTVFRFESLDSGVWT 
DDHSEACRNMHVLSVWTASCKAEPNPIWP 
HHPWLSCATWPCWKGFDLPGICFTALSCP 
KIYA 


2548 


A 


1 


1605 


PMYLFLCPPLALVQCALKDPRSKYSLGGR 

TTLnTLQGSGKKNNIPHPSSLSERVMTAKD 

GFVSRCHLLMQPKQQKWSLMYPMEGEVL 

ENGCWPTLQDSLLCTALVDKLLVFLGRCF 

CTAVEWMLVTCRTAAAVSAFLrVGRVSS 

PVCRAVSVQPWTLTADHTPGRYCLKLVCR 

QLCLCPSSTPLTEVFCSKEAFFULDCSNLPH 

ALLPVDSPKGLSKCSNPREKARRKLQGHY 

HVASEVSFVPVRRFPKGEIGANQPGTHRKF 

YHLTHYRQNLKQPDVPHGPJVFDDKDITD 

WQTAKIMREAVATVPEGRRVFSRMTVEEN 

IAMGGFFAERDQFQERIKWVYELFPRLHE 

RRIQRAGTMSGGEQQMLAIGRALMSNPRL 

TTT TM7T>CT /"2T A XHTfr\r\TCT\T'T 1 Cr\T PPAPXjrn P 

LAsL&Jar MAjLArillv^v^Lr JJ i JiivjLKjbQCjMTIr 

LVEQNANQALKLADRGYVLENGHWLSD 

TGDALLANEAVRRGDELTEDRSRSLDGELI 

RSLPCGASYGGLSLRPWSRGHIPQSHQSSE 

SVRVMFINTSKGASnSSSATMPGPLPKHLG 

P 


2549 


B 


1 


597 


MHVQGKAAILGRHFSISSLLPGALLLLTVIK 

GHTHPEEKSPGAHEKAVTGEPKCLGALPY 

CDSGGKKATKKKDAGEMRSRIKDGVLVL 

KCISLQVGLASWTVSWLRTEATGYTFALLP 

PGTHHTEQTPSKHEQNGAELFCNCVSCFED 

PCPCQVPGTQPGNRLSEEHQASSQADVTNS 

SAPKQPHPPPAPCKGVCSHC 


2550 


A 


278 


451 


MAGTAQLLGLKQLIGLELLTAQCGQITGY 
RDRREELLPPRFIATGPPSCHPPSQTVP* 










2551 


A 


1 


6530 


MWGSDRLAGAGGGGAAVTVAFTNARDCF 

LHLPRPvLVAQLHLLQNQAffiWWSHQPAF 

LSWVEGRHFSDQGENVAEINRQVGQKLGL 

SNGGQELHAVSLEQHLLDQIRIVFPKAIFPV 

WVDQQTYIFIQrVALIPAASYGRLETDTKLL 

IQPKTRRAKENTFSKADAEYKKLHSYGRD 

QKGMMKELQTKQLQSNTVGITESNENESEI 

PVDSSSVASLWTMIGSIFSFQSEKKQETSW 

GLTEINAFKNMQSKWPLDNIFRVCKSQPP 

SIYNASATSVFHKHCAIHVFPWDQEYFDVE 

PSFTVTYGKLVKLLSPKQQQSKTKQNVLSP 

EKEKQMSEPLDQKKIRSDHNEEDEKACVL 

QWWNGLEELNNAIKYTKNVEVLHLGKV 

WPKDISEEDIKTVFYSWLQQSTTTMLPLVI 

SEEEFIKLETKDGPSRSYGKRRKQGVNSLG 

VSSLEHITHSLLGRPLSRQLMSLVAGLRNG 

ALLLTGGKGSGKSTLAKAICKEAFDKLDA 

HVERVDCKALRGKRLENIQKTLEVAFSEA 

VWMQPSWLLDDLDLIAGLPAVPEHEHSP 

DAVQSQRLAHALNDMIKEFISMGSLVALIA 

TSQSQQSLHPLLVSAQGVHIFQCVQfflQPP 

NQEQRCEILa^VIKNKII)CDINKFTDLDLQ 

HVAKETGGFVARDFTVLVDRAIHSRLSRQ 

SISTREKLVLTTLDFQKALRGFLPASLRSVN 

LHKPRDLGWDKIGGLHEVRQILMDTIQLP 

AKYPELFANLPIRQRTGILLYGPPGTGKTLL 

AGVIARESRMNFISVKGPELLSKYIGASEQ 

AVRDIFIRAQAAKPCILFFDEFESIAPRRGH 

DNTGVTDRWNQLLTQLDGVEGLQGVYV 

IAATSRPDLIDPALLRPGRLDKCVY CPPPD 

QDGSSSSDSDLSLSSMVFLNHSSGSDDSAG 

DGECGLDQSLVSLEMSEILPDESKFNMYRL 

YFGSSYESELGNGTSSDLEDESMNQPGPIK 

TRLAISQSHLMTALGHTRPSISEDDWKNFA 

ELYESFQNPKRRKNQSGTMFRPGQKFFDEI 

TELTYLPSFHHKAAPHQAEPGPNSSSASAP 

PPYNPFTTSSPHTQSGLQFRSVTSPPPSAQQF 

PLKEVAGAKGIVKTALETAPTLALPVSSQP 

FSLHTAEVQGCAVGILTQGPGPCPVAFLSK 

QLDLTVLGSPSCLHAVASAALILLEALKIT 

NYAQLTLYSSHNFQNLFSFSHLTHILSAPRL 

LQLYSLF\^SPTTTILPGPDFNLASHIILDTTP 

DPDDCMSLIYLTFTPFPHISFFSVPHVDHIW 

FTDGSSTRPDRHSPAKAGYAIESSTSnEAT 

AIJPSTTSQQAELIALTRAFTIAKGLHVNrY 

TDSKYAFHILHHHAVIWAERGFLTTQGSSU 

NASLIKTLIJKAALLPKEAGVTHCKGHQKA 

SDPITLGNAYADKGVRCAPDPARRPLPLPI 

GLKACHCSCTAKIGGKYRALVGQLKTISV 










ATGLKTQDRTDDGSSQVIEEKNHNGYSVID 

TGTLVEAELEKLPNNWSPQTCELFALSQAL 

KYLQNQKTISILIQKEPSPALGLTPERKGNV 

GHAGKGPLESSSPDPFLCGQEKREKGCRTA 

TSVSITNPINRGPWWTHPGKELTPEHKGN 

VGHAGRDILAKAGAIIHLNIGEGTPVCCPL 

LEEGINPEVWATEGQYGRAKNARPVQVKL 

KDSTSFPYQRQYPLRPKAQQGLQKTVKDL 

KAQGLVKPCSNPCSTPILGVQKPNRQWR\T 

LCHQATQALFNFLATCGYMVSKPKAQLCS 

QQ/RYLGLKLSKGTRALSEEfflQPILAYPHP 

KTLKQLRGFLGVIGFCRKWIPRYGEIARSL 

NTLJKETQKANTHLVRWTTEVEVAFQALT 

QAPVLSLPTGQDFSSYVTEKTGIALGVLTQI 

RGMSLQPVAYLTKEIDWAKGWPHCLRV 

VAAVWLVSEAVKLIQGRDLTVWTSHDV 

NGILTAKGDLWLSDNHLLKYQALLLEGPV 

LRLCTCATLNPATFLPDNKEKIEHNCQQVI 

X rATT T A A /"*\/"Vr \t\t t^T TT\T rtrr\nr\r rm /'v i »i v/innTt r 

VQTYAAQGDPLEVPLTDPDLTLCTDGSSFV 

EKGLRKVGYAWSDNGILESNPLTPGTSAQ 

LAELIALTWALELGEEKRANIYTDSKYAYL 

VIJiAHAAIWKEREFLTSERTPIKHQEAIRK 

LLLA V QK^lfLb V A V LHCRGHQKGKEREIEE 

NCQADffiAKRAARQDPPLEMLDCQPLV 


2552 


A 


748 


1075 


ILPTSLFFLFCFVFFVCF*DRVLLLSPG\WSA 
VARSWLYCNLSLRGFKGFSCLSLLSNWDY 
RCTPLRSANFVFL/CRDRVSPCWPTSVSNS* 
PQWIHPPWPPKVLGITRV 


2553 


B 


1 


766 


MRPVDPDGTEHSLFCPLTALRGMVNSRIQ 

KSPGKPSVCDVPLPISPGQSSQLHGKVFGQ 

LNAGKAAEFLKSPPDHQAQAASTSGPQKT 

Tl^KRGLRLQPCQLHSAPHSFQLLPLTQiCS 

TWDLRGSAPLHAAQTSLSEFSCHRPDVED 

TLGTKGPDKTQCQSENSTRPQYSPETSQNQ 

PVGKGTDLKVTKLGVPSLMAQDGVNYSV 

KTEAHSTGTTAEPLSSQDRAVRGHNTDSH 

VQTPDLGEDTAL 


2554 


A 


47 


923 


KATRFISAAFWLNKQGVSPAKLPHTSWS 

WSLQTLSFLFSGDLAEKSLQCFPCSAMLLE 

LIPLLGIHFVLRTARAQSVTQPDIHITVSEG 

ASLELRCNYSYGATPYLFWMERTVEEAFEL 

LVCIJCPWRVASSLEKKEKEDESFQLLLGSR 

YNVLKGSRGETSEGGAESFSSQSPGENQLY 

SEMQFFYLCEQRAWPTESWVGLINLFFM 

ASWMKHSGKLWSKRNSEELCGTUHTAAQ 

LKDSGTYFPAVEAOFSOPTP^T DPNPSWAr 

SPNPFRERGMLPPQYHLHSFGFSD 


2555 


A 


2471 


2985 


ETSLERERLSFCTGSRTTRS AELKA VGFE A 
ALQEVTTPEWPASQSEAYQTLRQNQAQV 
HNFFFFWGGDSPTLSPRLECSSAISAHCNLR 
LPGSSNSPTSASRVAGTTGACRHARLIFCIL 










VEMGFHRVAQAGRELLSSANPPTSASQSA 
GITGMSHHAQPSSQLLISSCC 


2556 


A 


138 


564 


YRE VMVSES *ETP AGARGRP YYFS APGTAP 

\PAINVHPPPPSLSATPHPPQPQPPPPHQHNA 

KARVATIRTKRTSNCRIRSRKVRKSPPEKW 

VGFNRRPKASCPSPPGAARVDVGGETERR 

EQAAAPGEMGKWARPGEEYFHS 


2557 


A 


2 


585 


AAAAPAGGNPEQPvUDYERAAALGGPDGR 

AWGGRSPLPPP AP * AQG APGPRWPPPRAGS 

PAPSPAGCGGGKGGGLVTPGRGGPRAAGR 

EL/RAVRCPCPVRPRPPSKPALGGSLPQPEP 

AAAPGPSIR/PVLPIQTGSvPWRRPKSLRPVL 

GTRVGRTPPLPPP/PDPAGPPPLPLPGPVHPS 

RPPPPTGPWRPARADGRV 


2558 


A 


2 


224 


PRVRVQWAQI^QDKKGEMNSMTSTAGPP 

GSSSAPCATRRNLLQRQHLQRLSGEFKKDP 

ATYSKHLEPLEEERDK 


2559 


A 


43 


267 


GRLWSAMTPGKLKTLCKIDWPALEVGWP 
IJEGSLDRSLVSKVWHKVTYKPRNPDQFPY 
RDT*LELVLDPPPPTHSG 


2560 


A 


233 


692 


DNHPSFPRLPSSRPGTKEVLKEnnSDTTAD 

VIFYPIYRMSEMIFRRIKMPWLWLDLWYL 

MFKEGWEHKKSLKILHTFTNSVIAERANE 

MNANEDCRGDGRGSAPSKNKRRAFLDLLL 

SVTDDEGNRLSHEDIREEVDTFMFEVLYIV 

RFRYH 


2561 


A 


1993 


1379 


SLHLSERADWQYSQRAG/DAVEVFFSRTA 

RDNRLGCMFVRCAPSSRYTLLFSHGNAVD 

LGQMCSFYIGLGSRINCNIFSYDYSGYGVS 

SGKPSEKNLYADIDAAWQALRTRYGVSPE 

NIILYGQSIGTVPTVDLASRYECAAVILHSP 

LMSGLRVAFPDTRKTYCFDAFPSIDKISKV 

TSPVLVIHGTEDEVIDFSHGLAMYERCPRA 

VEPLWVEGAGHNDIELYAQYLERLKQFIS 

HELPNS*RQSK 


2562 


A 


991 


308 


AAASAFKPGLALSDRAFAAWEPSGAAVSR 

SPLSPPSRPFASREPAGFRAALADPPGMPR 

YELALILKAMQRPETAATLKRTIEALMDR 

GAIVRDLENLGERALPYRISAHSQQHNRGG 

YFLVDFYAPTAAVESMVEHLSRDIDVIRGN 

IVKHPLTQELKEWEGIVPVPLAEKLYSTKK 

RKK'EDSPDFSLICNSFTFGQHGREGRICKF 

GLYISMCCRCCLIFLRYF 


2563 


A 



1 


344 


MDKSLLLELPJLLCCFRALSGSLSMRNDAV 
IEIVQCRMCHLQFPGEKCSRGRGICTATTEE 

A PA/TVHT? X/TFlf T?ri<TI\IP WT TPA/rnPT FMfAn 
Av^lYl V \JlSJVi_r JvJvLJ VJ IN i WL1 FlVlvjr^J_JSJ.N v>/\X/ 

VKGIRWSVYLVNFRCCRSHDLCNEDL 


2564 


A 


251 


386 


LQRLECSGTI/SAHCNLCLLGSSNPLASAS*I 
AGTTGTLTGDVDST 


2565 


A 


1164 


1273 


EISNIQQADFPGVLATHPAFSRLLPCLHFIP 
KSANQ 



2566 



867 



156 



2567 



625 



182 



2568 



917 



2569 



2570 



481 



1380 



3344 



677 



PAPVKDEGPMVSASVKDQGPMVSAPVKD 
QGPIVPAPVKGEGPIVPAPVKDEGPMVSAP 
IKDQDPMVPEHPKDESAMATAPDCNQGSM 
VSEPVKNQGLWSGPVKDQDVWPEHAK 
VHDSAWAPVKNQGPWPESVKNQDPILP 
VLVKDQGPTVLQPPKNQGRIVPEPLKNQV 
PrVPVPLKDQDPLVPVPAKDQGPAVPEPLK 
TQGPRDPQLPTVSPLPRVMIPTAPHTEYIES 



SP 



QQGKNQECIRNQHTRAPGRGASPQQGEGK 
TWAWVGHPVPHALVIPGLQRGSARGLAW 
RQLGRAR*PRPPAPPRACRPEEPPYTPGRR 

APGRPAPAPRSACGWAASASRWCRRTVFF 

SO 



EELLCLDVSENRLERLPEEISGLTSLTDLVIS 
QNLLETIPDGIGKLKKLSILKVDQNRLTQLP 
EAVGECESLTELVLTENQLLTLP*SIGKLKK 
LSNLNADRNKLVSLPKEIGGCCSLTVFCVR 
DNRLTRIPAEVSQATELHVLDVAGNRLLH 
LPLSLTALKLKALWLSDNQSQPLLTFQTDT 
DYTTGEKILTCVLLPQLPSEPTCQENLPRCG 
ALENLVNDVSDEAWNERAVNRVSAIRFVE 
DEKDEEDNETRTLLRRATPHPGELKHMKK 
WENLRhTOMNAAKGLDSNKNEVNHAIDR 
VTTSV 



TSKQNAAPLVKYFQEKGLIMTFDADRDED 

EVFYDISMAVDNKLFPNKEAAAGSSDLDP 

SMLDTGEITOTGSDYEDQGDDQLNVFGED 

TMGGFMEDLRKCKIEFIIGGPGSGKGTQCE 

KLVEK YGFTHLSTGELLREELAS *SERSKLI 

KDIMERGDLVPSGTVLELLKEAMVG\SLGD 

TRGFLID\GYPRE\VKQGEERGRRIWRPHS 

WV1CME\CSADT\MTNRL\LQRSRSSLPVDD 

TTK\TMAKRLEAYYR\ASIPVIAYYETKTQL 

HKINAEGTPEDVFLQLCTS*LTLLFSEGKN 
ACLG 



GAYHKHLMELALQQTYQDTCXN CDCSRIKL 

EFEKRQQERLLLSLLPAHIAMEMKAEnQR 

LQGPKAGQMEhnTSTNFHNLYVKRHTNVSIL 

YADrVGFTRLASDCSPGELVHMLNELFGKF 

DQIAKENECMRIKILGDCYYCVSGLPISLPN 

HAKNCVKMGLDMCEAIKKVRDATGVDIN 

MRVGVHSGNVLCGVTGLQKWQYDVWSH 

DVTLANHMEAGGVPGRVHISSVTLEHLNG 

AYKVEEGDGDIRDPYLKQHLVKTYFVINP 

KGERRSPQHLFRPRHTLDGAKMRASVRMT 

RYLESWGAAKPFAHLHHRDSMTTENGKIS 

TTDVPMGQHNFQNRTLRTKSQKXRFEEEL 

NERMIQAIDGINAQKQWLKSEDIQRISLLF 

YNKVLEKEYRATALPAFKYYVTCACLIFFC 

IFTVQILVLPKTS VLGISFGAAFLLLAFTT .WC

FAGQLLQCSKKASPLLMWLLKSSGIIANRP 

WPRISLTIITTAIIIJVIMAVFNMFFLSDSEETI 

PPTANTTNTSFSASNNQVAELRAQILFFLPY 

FIYSaLGLISCSWFLRVNYELKMLIMMVA 

LVGYNTILLHTHAHVLGDYSQVLFERPGI 

WKDLKTMGSVSLSIFFrTLLVLGRQNEYYC 

RLDFLWKNKFKKEREEIETMENLNRVLLE 

NVLPAHV\AEHFLARSLKNEELYHQSYDC 

VCVMFASIPDFKEFYTESDVNKEGLECLRL 

\LNEHADF\DDLLSKPKFSGVEKIKTIGSTY 

MAATGLSAVPSQEHSQEPERQYMfflGTMV 

\EFAFAL\VGKLDAINKHSFNDFKLRVGINH 

GPVIAGVIGAQKPQYDIWGNTVNVASRMD 

STGVLDKIQVTEETSLVLQTLGYTCTCRGn 

NVKGKGDLKTYFVNTEMSRSLSQSNVAS 


2571 


A 


3222 


5798 


PLLTPLVSKVTAAGVPLFFFFFFFF*DIVSLC 
HPGWSAW*P*LTAASNS*\VKQSSHLSLPS 
SWDNRYAPPRPANYFYYFYFL*RLDLALFP 
KLLLNCWAQVILPSQPPKVLGL*AQSSEGG 
mSGLSUPSPCFLLCNPI 


2572 


A 


1 


666 


ASSTPQVTANEEINVTSTDSEVErVTVGESY 

RSRSTLGHSRSHWSQGSSSHASRPQEPRNR 

SPJSTVIQPLRQNAAEVVDLTVDEDEPTW 

PTTSARMESQATSASINNSNPSTSEQASDT 

ASAVTSSQPSTVSETSATLTSNSTTGTSIGD 

DSRRTTSSAVTETGPPAMPRLPSCCPQHSP 

CGGSSQNHHALGHPHTSCFQQHGHHFQHH 

HHHHHTPHPCI 


2573 


A 


300 


110 


PCGPPQEKGADCHLKACPTAPCTTFRASCC 
SHPASCSRGKQASMSSTSSSATVPLPANEM 
HSG 


2574 


A 


2 


362 


QELERSMAQRCVCVLALVAMLLLVFPTVS 

RSMGPRSGEHQRASRIPSQFSKEERVAMKE 

ALKVFPTWSTSFIQHEWEEYSHLFTIQGS 

DPSLQPYLLMAHFDWPAPEEGWEVPPFS 

G 


2575 


A 


1740 


2026 


ENGSLRPKPTGIPLSSARGNELSPTRRRRRP 
WTPNPAGETMSSVQQQPPPPRRVTNVGSL 
LLTPQENESLFTFLGKKCVGAGRGGRAPPS 
RAAGE 


2576 


C 


363 


692 


MLLWPLTQAQSSEMSCCRLGACFTTSLLHQ 
IPATALLEGNLDITLTVQLQILDAHNFPYRL 
CLIDRCICFISSSTYPQIDGLKSSRDIGDKISF 
VRSNGSINMGKPFNF 


2577 


A 


1 


2169 


MEGLNWLSLLAFIFLLCWMLSALKHQTPN 

SSAFGLUDIJIQWFATGSRMNKNNKPSSFI 

AIRNAAFSEVGIGISANAMLLLFHILTCLLK 

HRTKPADLIVCHV ALUfllLLLPTEFIATDIF 

GSQDSEDDKHKSVIYRRNRQSQHFHSTNL 

SPKAPPEKMATQTILLLVSCFVTVYVLDCV 

VASCSGLVWNSDPVRHRVQMLVDNGYAT

ISPSVLPRLTAPNEWRASVYLNDSLNKCSN 

GRLLCVDRGLDEGPRSVPKCSESETDEDYI 

VLRAPLREDEPKDGGSVGNAALVSPEASA 

EEEEEREEGGEACGLERTGAGGEQVDLGE 

LPDHEEKSNQKVAAATLEDRTQDEPAEES 

CQIVLFQNNCMDNFVTSLTGSPYEFFPTKS 

TSFCRESCSPFSESVKSLESEQAPKLGLCAE 

EDPWGALCGQHGPLQDGVAEGPTAPDV 

WLPKEEEKEEVIVDDMLANPYVMGDEGE 

EEEEEFVDDTLANPYVMGVGLPGRGGEEE 

EEEEWDDTLASLYKMGEEHRHKGLAPL 

WEGGQKPSQKLPPKKPDLRQVPQPLASEV 

OfiPPATID A \A/TC/*iDDT T3 A OT> A T "D A VXiTi A CT* 

r yKKl^JlJKA V V 1 JBurKr LJlAoKALr AKPKArl 

LYPRSFSVEGQEIPVSISVYWEPEGSGLDDH 

RIKRKEEHLSWSGSFSQRNHLPSSGTSTPS 

SMVDIPPPFDLACrnCKPlTKSSPSLLIDSDS 

PDKYKKKKSSFKRFLALMFNKMERPGTM 

AWAPWPQTT rjQ 


2578 


B 


1 


360 


MHLLQAALLLAVPCLLCYVAVGYAFSVLL 

TLLLTAPALLPDDFEGFNIREKTGWYGKKE 

GMVTLSNPQVAREKEQFNDLYFNAKQAE 

QKGYLNTARREASIAFKVTETTHNKSGLIT 

ES 


2579 


A 


1 


1036 


ATVGGREIYVKGFVHYKVRALFPCEKPPRP 

TEMSRHHSRFERDYRVGWDRREWSVNGT 

HGTTSICSVTSGAG/ERHSQQPQRPARPPAA 

ARGALPAAHPGYSSCSL/RPPAAARPSPAS 

WPALRLRSPPRLPASPKGTVSPRDWRPASG 

GGRRLSISPHPG/ITDEPPSKQMRESDNPGT 

vjr W \urKWrru 1 orr'on 1 rMhwr SLPPSVP 

GCERPGPGHWGDPLTASPRGAPAPADARP 

L\PLPQPPSQPLSS\GWSTCLPRPCMPALSP 

WPCPHCPVWGRWPAQDPPLWATATWQG 

PCCLHRRQPSRPPLSPWPLPPMGPPQPTRP 

TGCRCCGPLAWGSMSSPTRGTPE 


2580 


A 


1 


1535 


MEEKTNVQLPPGQTEQHVEIHIMNFCSKN 

HHRITPEKPKELTDPFKEAACCCKLYEIDK 

KLYRMAEWIKIHKPSICCLQETHLTHKDSH 

KIXVSITFKDLAVRFSEEEWRLLEEGQREF 

YRDVMRENYETLVSVEPGRAVGGGSHAD 

EGQEPAGCG/VSPGPGAAGEGDPRVLVWR 

SQGRYGQPRERNGRGASLDGERASPEAA/D 

GKRALPSPRPAQLPSRRPYQPAPPGVPTPTD 

SSCSSGPTGDGVQGSPLPIRISPGNSPL/PRP 

HQLSEGNPCAWAPAPRDIPKLLATSP*PGH 

VQANQSRPGAWEPALGRSDQRACSASGSA 

ELCERWPQQAP/APPEEPPPASPHPAAPTG\ 

PGFWESCGEPGAAVPGKGSAPKPSPLHCLE 

SALRGILP\EGPCASPAWEAPAPAPAPAPAR 

ASAA/AEGEDPRPEPELWKPLPQERDRLPS 

CKPPVPLSPCPGGTPAGSSGGSPGEVAPGEO

SPGTAAASVQ/VSPAHWPCFS/SPVRYSSGS 
LPGFSAGEKAQG 


2581 


A 


3 


514 


PRLLMEAGPHPRPGHCCKPGGRLDMNHGF 

VHHIRJRNQIAPvDDYDKKVKQAAKEKVRR 

RHTPAPTRPRKPDLQVYLPRHRDVSAHPR 

NPDYEESGESSSSGGSELEPSGHQLFCLEYE 

ADSGEVTSVTVYQGDDPGKVSEKVSAHTP 

LDPPMREALKLRIQEEIAKRQSQH 


2582 


A 


307 


1503 


GGSSARPRASSRRMLSRKKTKNEVSKPAE 

VQGKYVKKETSPLLRNLMPSFIRHGPTIPR 

RTDICLPDSSPNAFSTSGDGWSRNQSFLRT 

PIQRTPHEIMRRESNRLSAPSYLARSLADVP 

REYGSSQSFVTEVSFAVENGDSGSRYYYSD 

NFFDGQRKRPLGDRAHEDYRYYEYNHDLF 

QRMPQNQGRHASGIGRVAATSLGNLTNHG 

SEDLPI^PGWSVDWTMRGRKYYIDHNTNT 

THWSHPLEREGLPPGWERVESSEFGTYYV 

DHTNKKAQY\RHPCAPTCTSV*STTSCHI/A 

S/RQQTERNQSLLVPANPYHTAEIPDWLQV 

YARAPVKYDHILKWELFQLADLDTYQGM 

LKLLFMKELEQIVKMYEAYRQALLTELEN 

RKQRQQWYAQQHGKNF 


2583 


A 


1341 


1015 


LGTRGCLNMAAPLSVEVEFGGGAELLFDG 
IKKHRVTLPGQEEPWDIRNLLrWTKKNLLK 
ERPELFIQGDSVRPGILVLINDADWELLGEL 
DYQLQDQDSVLFISTLHGG 


2584 


A 


1 


741 


VRSMSCPPSWPYCAPCPTNIGESTSPLRKTI 

ETPTLWDPKAPSCSLELPPWVLASPQRSRG 

TALPFLPSNVLPSLALPSTSFLCRPLLSHLV 

TSLLAGPGAHDGHLRKEGWRSTPEMTSLP 

APEHPASPCDSVLCSPDVSMCTLGPAARW 

DAQAKSAPLPPCCTDCKSFPHLQRPWAQP 

HTSQATSVDSGEAGTKGMSQFTVWTWWR 

SRPCETRQGEGIGNWGYSVTPGPPGSQNLP 

ARLDGQGLAS 


2585 


A 


36 


363 


NAHSLPffiWAFCKIENLCGKCVYMCMCSQ 
NKNNQLKFSFIPGRWCASLKMYSKGQRSL 
MYPCRYHQRMLLVSRYLDTVLLDWDPPG 
PLPEGRQHSPGRRQRDLASALLC 


2586 


B 


1 


1107 


MLYWIMPKGKLLWIASFLTRLQGIQHTLP 

RVEEKSIQSVKDDNTYHPHPRPRIAWGSSS 

TVISYSPGEYAFTNGTSRCPSLSLAAGPRLI 

TNGPWEAHEVQRESTIALMKLLQVLEQKV 

RLREGHSLGTVKMSKNTNPMGHVSNPPTS 

YPDELITKQVCPGSHPKRPGEVKHNEEVPT 

SQDRD I C rTQETQJ Yb VRKUsAEDDrT VKN 

YNHEPJsIKFTTPSRKGQQAHRAWLNBCAIPQP 

MPTSATSLLAALVRAAKHRNQQPQDLAQS 

SSHHIYLFITITFGSLRDSELKSKRGPDPQLS 

LELEMVAKAKAVKPENSRRWFSGNQLGSI 

INSPBCKGSAVLEGTFQEKQKWDARLTKGD

CI ATNJVT MP V 


2587 


A 


1 


384 


MACRVLQGLPFACLSSPICSHSALTLDHSL 
LCLNFFFNPYALLPNFYLFGKFQLRIPSSGK 
PFLTSEDQDPAGIFELVEWGNGTYGQVY 
KVRRMVWKYDLHICRAVLGGVEGSRFLV 

V^Ivo JuvJVJ I vJIv^ 


2588 


C 


1 


417 


MLLPLFLLIHTGIGPSYSASDRAEPRPSPGG 
RLTARIWIKGVKEDGGTMQGAVDWGEGV 

PT? O A riT? TT A CUnfA/ A T^V"\17XICCT)XTCT PPCHC 

e»]S\*j\\jj&± 1 AoxUv V AJJJv W Jrio o KIN olAjLrorrS 

PGTPAPGPWVGFCHPCLPASPLSWTATGT 
AATHAQCAERVHNLCRRAKPS 


2589 


B 


1 


198 


MQAGLARAMVLAAGWSRVASAGAAGDT 

9PVPP AT QFlT PTTHTfrnT T \f~DV A \fQWTV CT t? 

LFPITVEL 


2590 


A 


267 


614 


MAVAVLLCGCIVATVSFFWEESLTQHVAG 
LIJLMTGIFCTISLCTY AASISYDLNRLPKLI 
YSLPADVEHGYSWSIFCAWCSLGFIVAAG 


2591 


A 


5 


447 


SSAFRSVLLEMRVSSRTCUDTLQGAVPTYP 
GSGTPALGEKSGSLGLVAWSFPRPGESSST 
APRRSPCCCPWSPSHSSPASFPPLRPSAPAT 
RAPREGLPTPASRAHFPGATAIPKTSGLLIA 


2592 


A 


508 


870 


GHCPVLRWTEKHCRACEKEGMDSSIHLS 
SLISRHDDEATRTSTSEGLEEGEVEGETLLI 

v LoHL/yAo V D i^o JnJJ o vjrJu o JLIN oJJJbLxD VoW 

MEEQLSYFCDKCQKWIPASKELLNSFDLSI 
PV 


2593 


B 


20 


201 


MGRVSGLVPSRFLTLLAHLVWITLFWSRD 

SNIQACLPLTFTPEEYDKQDIHALPAVTEM 

ALFVTVFGLKKKPF 


2594 


A 


79 


243 


MSFICFLNFWPTSAIPLRLWNYCGMNSPS 
RSWDCLCTPLSROSAPVSHMAKVW* 


2595 


A 


178 


1224 


RYRAARNVMKDQRLVFHSKVRSSGYASA 

PHVTMFSPKTNIKSEGKGSSRSRSSCAREA 

YPVECAVPTKPGPQVAAAPTCTRVCCIQYS 

GDGQWLACGLANHLLLVFDASLTGTPAVF 

SGHDGAVNAVCWSQDRRWLLSAARDGTL 

RMWSARGAELALLNRYKQKSKSKLICRLST 

Tf"r A VTiA/TTQT C A V\TTYI7VQTTT\/T A ACD\TDT\/ 
i v ULvx i ol,o A V £sUr I oJll V L, AALrKJN K 1 V 

EVFDLNAGCSAAVTVEAHSRPVHQICQNK 

GSSFTTQQPQAYNLFLTTAIGDGMRLWDL 

RTLRCERHFEGHPTRGYPCGIAFSPCGRFA 

ACGAEDRHAYVYEMGSSTFSHRLAGHTDT 

VTGVAFNPSAPQLATATLDGKLQLFLAE 


2596 


A 


85 


839 


RSGSLMAAAAATKILLCLPLLLLLSGWSRA 

GRADPHSLCYDrrVIPKFRPGPRWCAVQGQ 

VDEKTFUSYDCGNKTVTPVSPLGKKLNVT 

TAWKAQNPVLREWDELTEQLRDIQLENY 

TPKEPLTLQARMSCEQKAEGHSSGSWQFS 

FDGQIFLIJFDSEKPJVIWTTVHPGARKMKEK

WENDKWAMSFHYFSMGDCIGWLEDFLM 
GMDSTLEPSAGAPLAMSSGTTQLRATATT 
LILCCLLILLPCFILPGI 


2597 


A 


319 


513 


IELRAVAQGIAQSLGQLLFTQCPLEKKDLE 

GLFLQNNKEGVQKGRDEPLPPLP*ATALSS 

IQAGIQQAR*EGDLEAWQFPVRIHPPDQQG 

NnVTFEPFPFKLFKEFKQAVNQYGPGSPFV 

MGLLKNVAVSSWMffTDWDALTRACLTP 

AQFLQFKTWWADEAGRV 


2598 


A 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFA 
CQRFLGFAVFLLPWASMWLRSLLKPIHVFF 
GAAILSLSIASVISGINEKLFFSLKNTTRPYH 
SLPSEAVFANSTGMLWAFGLLVLYILLAS 

nil rrm -rx 

SWKRP 


2599 


A 


54 


470 


CSTMNPSEMQRIAPPRRQRHRSRAPSAHK 

MNRMVMSEEQMKLPSTKKAEPPTWAQLK 

KLTQLAKKKNLENTKVTQTPENMLLAALK 

TVSTVSAGVPSSSEESDHRERAMMTTWL 

SKRRGKCGEKKEISDCYCVYVERS 


2600 


B 


1 


939 


MALRLVIPALWEAELVGALMLAALSHLHR 

FLLSMWVLPPGTFTDAFPGLLFHFPRRSQK 

DCLLGLSKSDQRAMACYFGILLIVSATLCF 

GMNYYLDEFANLLDELLMKINGLSDSLQL 

PLLEKTSNNTGEARTEESPLVDISSYQAAE 

MVMMARTLATCLQHAQGLGFEACLPILSA 

PHALSHWTLTTCLWQLGFMSAVLILKYTR 

ALLAQGQFSGPFVIDKGVRLELIGLISRVW 

EVSEQENSKEEVYRHEEGITVISDLLLGRQ 

WQQGHKGICLQLMLPFSRGKHRTSGAFLM 

FSLELFTVAQLVPISGS 


2601 


A 


1 


698 


VLNPLGKP *HDTPAWHEEGYPFPTAPP VDP 

FAKIKVDDCGKTKGCFRYGKPGCNAETCD 

YFLSYRMIGADVEFELSADTDGWVAVGFS 

SDKKMGGDDVMACVHDDNGRVRIQHFY 

NVGQWAKEIQRNPARDEEGVFENNRVTCR 

FKRPVNVPRDETTVDLHLSWYYLFAWGPA 

IQGSITRHDIDSPPASERVVSIYKYEDIFMPS 

AAYQTFSSPFCLLLIVALTFYLLMGTP 


2602 


A 


2 


319 


FYLFILFLFFVFLVETGFHHVGQAGFELLTS 
SDPSALASQSAPJTGMSHHAWPNFCLLSRD 
QVSPCWPGWS*TPDLR*STFLGLPKC*LQA 
*ATVPSAGEPQCGQ


2603 


A 


147 


773 


MGLGARGAWAALLLGTLQVLALLGAAHE 

SAAMAASANIENSGLPHNSSANSTETLQHV 

PSDHTNETSNSTVKPPTSVASDSSNTTVTT 

MKrlAAbJNl 1 IrdMVatrsMlol IJ-KolrK. 

TTSVSQNTSQISTSTlvrrVTHNSSVTSAASSV 

ITlllMHSEAKKGSKFDTGSFVGGrVLTLG 

VLSELYIGCKMYYSRRGIRYRTIDEHDAII* 


2604 


A 


2 


331 


WWSSPrTARDAIXiIKHTMVKIRPLSQATR 
AAKAKARAYAEFLQPAKERPETSAALARR
LVISALGVRSKQSKTEREAELKKLQEARER 
KRLEAKQREDIWEGRDQSTV 


2605 


A 


549 


641 


CCCCCCCLCFGIHSSKGTHSANSDKWPFDP 


2606 


A 


1 


517 


SCYVCGGTVTGDQ WP *EARELVPTDP VPD 

EFPAQKNHPDNF*VLKVSIIRQYCTAIEGKQ 

FTHSIGRLSCUIQKXYNGTTKTVTWWNSN 

YTERNPFSKFPKLQTWAHPEFHWDWMA 

PTRLYWICGHRAYAKLPDQWTGSCVISTIK 

PSFFLLPDCTGELLGFPVYASHEKR 


2607 


A 


2 


406 


FLVETEFCYVGQAGLELLTSRDPPASASKG 
AGMTGVSHQVQPQ**S*LWT*/PSSVEAGT 
SFGLSFLSSSWALSAQEGCLAVPS/SGSRGL 
LVGALLLWTKPSPQLSPVPASQRLSSLSLM 
PPLPQPQHLTHTSIET 


2608 


A 


2264 


37 


FFFNKNLLFIQKLTPGVFSPIFKKKKKRGGQ 

GFPSQCP*VNSLAIQGWPSRGVSGKRCQKC 

GGPGPLRTHSPLLASPLQPPS/WTTRPVGLQ 

PPGAL\GLTTTRGRAALP*LP*N*MLKPRW 

EQGDFPPGGWAMEAFSRDSLPLQEGIPGIP 

TSPPTPSEK\NKVPETPGALV*ETGCQTEKH 

FRGGDVSTEGDTYACLDVILNVACLDHGK 

SEHSPKSPSTQSEEQTLRGRGQAVADWPPG 

AGACPGPSARLCRGTMGMPSASEHLKRAA 

LGGK/PPLWRGARAAQEAPGSGFCGITAAR 

GLGRGGGRDRSLPGKL**KWPVSSTPPGPG 

RAALPAALGWGCGPTGM/PGLRSASIPSA 

KARSHTCGFKPKG/LKGRTMEEGQTHRRG 

PHA*AQTPSATGQWQQCyPVPLDQRGKSS 

LRQRPKESNLT\GKDLPHPLSPKPPC\RSLPQ 

TPGQSPAEKLQPLVLSPRSPGPAAEQGAD 

WQGPQPJHPSKWPVKVEPLTPSLQDVGGG 

GGVTVGPACSPRGLPMNASGGTLGLAECS 

SQGEQPRSPTRQRHHGRGLPRAGGLLAEG 

GNRGPKC/PPLKHGLMGC*LCKAAARILDP 

GLALTVWEAASHNPSLPCARTPSGSQRALK 

GLGGTRKCCGKGQGVPHDVNSSAGTDPTH 

QQPRNRGCA/GDSDSPSGCWGQANLTTAS 

PATGN*TPGLE*HDVGMEKGLQDQ\QPGPP 

RSADGATETQRGQEAAHNQRARGRTLGS 

YLWSRVGSHSW 


2609 


A 


1 


399 


MDGQARWLTPVIPALWEAEVFIEHMLYAL 

NBLRTVLGRARTLSLNHRCRLLLLSLLVLH 

CVRSVRSWYLFCEAAAEKTLAFAMAEEKP 

KALSMGQERFRFDSQPINETDTPVQVEMED 

roilDVFHQQIGGVY 


2610 


A


1


1641 


MGELUMTTEEKHQPFMDTQTAAKGTLLEA 

GPGLDPVCLGHKKVIQRKFWRYSAPGTVP 

TTSAIPGETEWGRLPQWSTAWSETAQHGW 

PAARQSRITVLHQQPQCDPGPEVTSEQLPG 

VINMLTLKYIKVAAHPHGSWNTRVPCLVA 

VLLTPTRLSYYISEIQTTFREYYKHLYENKL

ENLEEMDKFLDTYTLPRLNQEEVESLNRP 

MTSSEffiAVINSLPTKKSPGPDGFTAEFYQR 

YEEELWFLLPvLFQTIEKEGILPNSFYEASIIL 

IPKPGPJDTTKNENFIO'ISLM^AKTLNKIM 

ANRIQQHSKKLIHHNQVGFISGMQGWFNIC 

KSINIIHHINRTNDKNHMnSIDAEKAFDKIQ 

HPFMLKALNKLGDDGTHLKIIRAIFDKPTAN 

HLNGQKLE AFLLKTDTRQGCPLSP LLFNW 

LEVLARAIRQEKEIPAPADTSSLIAHHPSPS 

YQPWTPVTRTSHSTPTITCYPCLECTPAKW 

LTSVSTMGGGLLSVPQGTVRVSALNYCFIP 

QLGGGPLMASSASSDYVPESDESEPLFTFE 


2611 


A 


146 


411 


LLSPSHPLTAPPPRPPRPPPTRAPGACASSM 
GPPTSKFPKDLTLPGDAALGCGTPATGGEG 
ASSRARSETQRARAPTPGRSWGRAGSA 


2612 


A 


2 


384 


PICLFSRPTLRPSRSKVSLIEGRGANMAAR 

WRFWCVSVTMWALLIVCDVPSASAQRK 

KEMVLSEKVSQLMEWTNKRPVIRMNGDK 

FRRLVKAPPRNYSVIVMFTALQLHRQCW 

CKYELQLRFKJK 


2613 


A 


1 


626 


SRVEDFVLHLLRALAQDDVVPYFKTEPGL 

PQ1HLEGNRLVLTCLAEGSWPLEFKWMRD 

DSELTTYSSEYKYIIPSLQKLDAGFYRCW 

RNRMGALLQRKSEVQVAYMGSFMDTDQR 

KTVSQGRAAILNLLPITSYPRPQVTWFREG 

HBQIPSNRIAJIXENQLVILATTTSDAGAYY 

VQAVNEKNGENKTSPFIHLSIASFCGNTTQ 

D 


2614 


A 


412 


1 


SNLCLGNSWRWRWAKSRHHCIPTVTLSKR 
SGDIRGSHFSSPQRQRSQRVPGKETARVLR 
AGKQGRGQIPIPCPWPPPPPPPPPGSPGPGC 
RQFHQSLEAKARHPASVREMRGKVKMRR 
ALRRAP ASTRAS SRQPNPK 


2615 


A 


2 


474 


TGPTIKNMDGTFNVTSCLBCLNSSQEDPGTV 
YQCVVRHASLHTPLRSNFTLTAARHSLSET 
EKTONFSIHWWP1SFIGVGLVLLIVLIPWKK 
ICm^SSAYTPLKCILKHWNSFDTQTLKKE 
HUFFCTRAWPSYQLQDGEAWPPEGSVNIN 
TYSTTV 


2616 


A 


223 


2210 


SLSGFTREASFEMAAQRIRAANSNGLPRCK 

SEGTLDDLSEGFSETSFNDIKVPSPSALLVD 

NPTPFGNAKEVIAIKDYCPTNFTTLKFSKG 

DHLYVLDTSGGEWWYAHNTTEMGYIPSS 

YVQPLNYRNSTLSDSGMIDNLPDSPDEVA 

KELELLGGWTDDKKVPGRMYSNNPFWNG 

LFDAGTSSFTESSSATTNSTGNIFDELPVTN 

GLHAEPPVRRDNPFFRSKRSYSLSELSVLQ 

AKSDAPTSSSFFTGLKSPAPEQFQSREDFRT 

AWLNHRKLARSCHDLDLLGQSPGWGQTQ 

AVETNIVCKLDSSGGAVQLPDTSISIHVPEG

HVAPGETQQISMKALLDPPLELNSDRSCSIS 

PVLEVKLSNLEVKTSIELEMKVSAEIKNDLF 

SKSTVGLQCLRSDSKEGPYVSVPLNCSCGD 

TVQAQLHNLEPCMYVAVVAHGPSILYPST 

VWDFINKKVTVGLYGPKHIHPSFKTVVNTIF 

G\HDCAPK\TLLGSGE\VTRQAPNPAPVALQ 

LPQDLKVCMFSNMTNYEVKASEQAKWR 

GFQLKLGKVSRLIFPITSQNPNELSDFTLRV 

QVKDDQEAILTQFCVQTPQPPPKSAKPSG 

QRPvFLKKNEVGKIILSPFATTTKYPTFQDRP 

VSSLKF 


2617 


B 


10 


462 


MSGWLGLVSSLHRLLVSPCPGRTVGLQRR 

KRIJCSGSSRMSFPVTRRPPvEQTPHPDIVAAI 

PSGTDDFQGHRSKEKENWKPMCLNRFILE 

ECIAADDFPJRGLEPNPQYLQGKPTQVSES 

LRLLRNDTQDPNIKTRYIMNLAKTIQRSPD 


2618 


B 


1 


406 


MIIIPKNLNMCALQSKPESRGFGELSQRGN 
VKFNVETLCSHQBCKISRLSAAIHQLDISDIR 
PLTVLLTLCITLALLMRGAQPGMNSGKIPY 
RMFIPNSHSDSELMSFQDSVRHRRGGFQTF 
DCDSQOETFWTWSIX 


2619 


B 


1 


789 


MGRERDPSGWTWLLRCAAAACALLLGSQ 

RQETQLLLSEHSDPDIEHRVRGEPKRTTRW 

LGVECWRQGVINIETKAQEQLQPKGKKVS 

SLLTALPGSIDELSLKRDVKESISLPAVPFQI 

ELLLISKINMQTRLLQLPLKFAVAAASSRF 

NPRPPVIGQLLRGKKSTPWQPDKPIKSPAG 

VTAATLQAGVGWAEEQSGHCAQVHSLGV 

DSSCWSPRSGYTYVHHPVHTPTLCALVGS 

GGERGGGEGEKHIGLEEQEPQKRVLN 


2620 


A 


3 


913 


FMTOWSWLLTFGFQLHNVIPGYPKPDMD 

AMEPSYELIHTQMKTQEWDNSKSILGVQC 

EVQKQLKAFVTLERFDQLYGSTTrSCQQAP 

KTKKFASSGSVFGKGVKFALKDGRVTTDn 

SVANEDGRRVAAILNHAHYLENLHFTIDG 

VDTHYFVKPGPSEGDLAILGLSGGRRTLEN 

GVNVTVSQINTVLNGRTRRYTDIQLQYGA 

LCLNTRYGTTLDEEKARVLELSRQRAVRQ 

AWAREQQRLREGEEGLRAWTEGEKQQVL 

STGRVQGYDGFFVISVEQYPELSDSANNIH 

FMRQSEMGRR 


2621 


A 


30 


2298 


LTRAPDPDRVGLVADFLRLFIPTAKGPVIN 

APLPQRLRSNTAPIRTLHAPSVHRPTGRES 

MPRTRLTRARTSPDTTGSDKTPHPRPKTLPI 

QTRSCADSGKLSEIRKIDDPLQHHLQNQSI 

QKSVKQCHEQNMFGNTVNQNKGHFLLKQ 

DCDTFDLHEKPLKSNLSFENQKRSSGLKNS 

AEFNRDGKSLFHANHKQFYTEMKFPAIAK 

PINKSQFlKQQRTHNffiNAHVCSECGKAFL 

KLSQFIDHORVHTGEKPHVCSMCGKAFSR 



WO 03/080795 



PCT/US02/25485 



KSRLMDHQRTHTELKHYECTECDKTFLKK 

SQLNIHQKTHMGGKPYTCSQCGKAFIKKC 

RLIYHHRTHTGEKPHGCSVCGKAFSTKFSL 

TTHQKTHTGEKPYICSECGKGFIEKRRLTA 

HHRTHTGEKPFICNKCGKGFTLKNSLITHQ 

QTHTGEKLYTCSECGKGFSMKHCLMVHQ 

RTHTGEKPYKCNECGKGFALKSPLERHQRT 

HTGEKPYVCTECRKGFTMKSDLIVHQRTH 

TAEKPYICNDCGKGFTVKSRLIVHQRTHTG 

EKPYVCGECGKGFPAKIRLMGHQRTHTGE 

KPYICNECGKGFTEKSHLNVHRRTHTGEKP 

CNECGKGFTMKSTLSIHQQTHTGEKPYKC 

NECDKTFRKKTCLIQHQRFHTGKTSFACTE 

CGKFSLRKNDLITHQRIHTGEKPYKCSDCG 

KAFTIXSGLNVHQRKHTGERPYGCSDCGK 

AFAHLSILVKHKRIHR 


2622 


B 


1 


2034 


MKLMETLNQCINAGHEMTKAIAIAQFNDD 

SPEARKTTRRWRIGEAADLVGVSSQAIRDA 

EKAGRLPHPDMEIRGRVEQRVGYTIEQINH 

MRDVFGTRLRRAEDVFPPVIGVAAHKENT 

LLPFYLGEKGDVTYAIKPLAGRGLTYFFLS 

GSARIEMiLMGKFVERKLATHTTLSFDWPL 

ETTPQLLPPHILSPVFASASPSRCWRVASGK 

YCKVFRGSGFQAQXIPQPTLRDPHYVEDK 

GHKYLVFEANTGTENGYQGEESLFNKAYY 

GGGTNFFRKESQKLQQSAKKRDAELANGA 

LGIffiLNMDYTLKKVMKPLITSNTVTDEIER 

ANVFKMNGKWYLFTDSRGSKMTIDGINSN 

DIYMLGYVSNSLTGPYKPLNKTGLVLQMG 

LDPNDVTFTYSHFAVPQAKGNNVNRFTQF 

RLSETKEITNPYAMRLYESLCQYRKPDGSG 

IVSLKIDWIIERYQLPQSYQRMPDFRRRFLQ 

GQFDHAASPVERGHLRKIPFRGGTRESRER 

r!T QEAfiVT PPE AfWlAOK'PRPWrK'nPT FTfT 

GLETLHCDSRRYPCRSNWVWICTVKEGGR 
EGRGGRGRRVQLAAVAGTVAPAAAPKNP 
PPRFRWSVWARDGVKERVPLQAGVGGGQ 
AVDRRFTARRSROWT T RTWDSIGRDRSLG 
GNGFFTTADORFDFAVLWLVAFRINSDKL 


2623 


A 


513 


796 


TGTAWTPPPPPLTTGAPCTPPPRCTARGRT/ 
PGDSHLGGGPAATAGGPRTSPMSSGGPSAP 
GMRPPASSPKRNTTSIJLNSGLEPTFSFRITF 

OT-T1V1 


2624 


C 


60 


472 


MPLLEYARNMLRTWSSLPWTRFRVCLLSL 

SIJT1.WANRLEDSRSCQPNPMSLTTLPGHRL 

KEAVWLPAPSRTMSPHLDPNQLGILLRVLR 

KEKEDGDYPDMMATHPSSRYEACSSGITL 

AAPPTHGPRPTDPRIGPAP 


2625 


A 


1 


1322 


MAIIJKVIYRFNAIPIK1JPVTFFTELGKTTLR 
FIWNQKRACIGKSVLSQKNKAGGITLPDFK
LYYKATVTKTAWYWYQNRDIDQWNRTES 

SEIMLHIYNHLIFDKPDKNKKWGKDSLFNK 

WCWENWIAICRKLKLDPFLTSHTKINSRW 

KDLNVRPKTIKTLEENLGNTIQAIGMGKD 

FMTKTPKAMATKAKIDKWDLIKLKSFCTA 

KETTIRLLGRPPALFTASSSVLKQLALEGILI 

LDSRALLGFLYEARHSHSNSPNHDAQNAT 

SKKbnRDGYDKIYRQEQVLARMEEKTLITA 

GGNVKWCSHFRKQIGGQWLTLETKTKTPQ 

PFSSTSQISTDKDKGLNPQLLKMDPGHMG 

WCTPGMGIPWQLSSDDRVWVLAAAGSGR 

HPGSGFKSL/PGLLHEGSYGH****S*I*GGN 

S*GSSGGPQCISGEERVFRWQSI 


2626 


A 


129 


329 


VSNTVDPHQTVGLSTQEPGDIFTYSEFDGIL 
GLAYPSLASE*SVPVLDNTMQRHLVAQDL 
FSVYMSR 


2627 


A 


43 


456 


EFFHHVGQDGLDLLTS*SAHLGLLKCWDY 
RREPPRPASDGHY*TDATGSLPSSGTT*IRT 
KPSQAPASWGLWNLAHHPPRSHPSCPMAN 
LICSTLSSF DGGSr GTGP G G WCPLGLSGSPA 
RAVFKDSSCSLHPLATGI 


2628 


A 


3 


290 


RQGFPLCNHKGTVTADLQPLPPGLK*ISHL 
SLLSSWNYRCTPPHPADF*FFVERRSHYVA 
*ACLELLCSSDLPALISQRVGITGMSTTPGPI
CLL 


2629 


B 


1 


804 


MVIVGLAAGVLLVGPGDGGLISEGWRED 

LMCGVWSAGTWSVGTAERCLEKPGALHV 

IEGPLDSWDGPVMPNGPVKNHKGEQQEVP 

SKHPQMALEICLCLDFLYYPFLRGDASAGP 

VTWCTrSDTlILQQHRTLTSQGVDDFLKAK 

ATFKASDFIDALVLSKDLNSGGRMELEIKC 

LDCVLELDLEGSGEPWKVLDKGVTVSYVF 

EM riEGCLEGVNKJSQETREGACGAGLEMA 

KEGSCLDERSSGTVSGYTQVSSELVCSGFL 

SPG 


2630 


A 


322 


549 


GGGSSPRELAGAAGLTVTSQAVAARRQQP 
SFSRARAPAHSLRAALSLASSARSWGAVSR 
DRCjrCrrAlMYQSSNKC 


2631 


B 


1 


384 


MLVPVLILSPCLVGIEPWEVSPHTNSTSSYE 
STPKSYPLGTAAKAASGQSPSTTSPLPETAP 
STLHERGLENWCSDKDLRQATGYSAAEK 
SKPPGLClKAFCPEAffDAQDWVKCQPLGS 
LSALNF 


2632 


A 


1 


275 


KTSQDTBCPSVLWKDVNSNLWCRPHDLLT 
WGRGYACVHIPSGPLGIPVQCIKPYHGMA 

PGWGM 


2633 


B 


56 


3476 


XGKPEKFSFGLLDLPFRVGVPFN1PLEFQDE 
FGHTSQLVTDIQPVLEASGLSLHYEEITNGP 
NCV1RGVTAKGPVNSCQGKVAPNLPVYW 
DCSSSGTSELTGSAIQVQNIKKDQTLKAPJEI
PSCKDVAPVEKTKLLPSSHVARLQIFSVEG 

QKAIQIKHQDEVNWIAGDIMHNLIFQMYD 

EGEREINITSALAEKIKGLLPDVQVPTSVKD 

MRYCQVSFQDDHVSLESAFTVSMLELLQL 

MVSLKTSNLLNNFRPLPDEPKHLKCEMKG 

GKTVQMGQELQGEWmTDQYGNQIQAFS 

PSSLSSLSIAGVGLDSSNLKTTFQSIPVINGR 

DLQNPITVQLCDQWDNPAPVQHVKISLTKA 

SNLKVKAIYNKSIIEGPIIKLMILPDPEKPVR 

LNVKYDKDASFLAGGLFTGYVRPVPVPRS 

LNSDISYFGVGGKQAVFFVGQSARMISKPA 

DSQDVHELVLSKEDFEKKEKNKEAIYSGYI 

RNPJCISMFEKGKVPKIVNLREIQDDMQTLY 

VNTAADSFEFKAHVEGDGWEGIIPYHPFL 

YDRETYPDDPCFPSNNFGISFVHSLEVILXL 

KDEDDEDDCFILEKAARGKPJPDFECFWNGR 

LIPYTSVEDRGLAPIECYNRISGALFTNDKF 

QVSTNKLTFMDLELKLKDKNTLFTPJLNG 

QEQRMKIDREFALWLKDCHEKYDKQIKFT 

LFKGVTTRPDLPSKKQGPWATYAAIEWDG 

KIYKAGQLEPQALYDEVRTVPIAKLDRTV 

AEKAVKKYVEDEMASLWELGYKPVQHMT 

VI^TAGNCNTTFWKKINITVILRCRSLTKV 

LLATERTFETAGVGGLILGQVEEARLKEAQ 

LRNELKIHNIDIPTTQQVPHIEALLKRKLSE 

QEELKKKPRRS CTLPNYTKGSGD VLGKGQ 

STGLGPVEVTQSSPSSRTSEYFWLTKFCWL 

EDWASGESLRLLPLMVEGEGEPVYAEIIW 

QKRDETVKDGVTLYLLQSVNQLLLTATKE 

RIDFLPHYDTLVKSGMYEYYASEGQNPLXI 

YTHVGDREAQAALKLGRWSHPRTPNAVG 

APGPPEGAGGGDAVTSQSALLTFSRTRFAS 

GAHAGAHPVLLRNEEEKGAPALVAPIFSAE 

GPTCSLWWTLRPASTAGLKLPARRVHATQ 

PERAH 


2634 


B 


1 


384 


MLASPLWLQALSLAAGTWRPRLGSGQAG 
NSEMRAGFLPGAGSQVRAQLQDRLPKTTE 
TKGALWPHTELCGMWSIAPGAENQELQID 
SPLLGQLSNQVWREDGYGKAFRLRTLSSM 
GTTEEANENVLI 


2635 


A 


628 


1117 


FFISVINGQVSSVQRLSGVGPACLSCGSANP 

GPPPGTSPGAGAQRR*\PRADGSGSPQWPR 

GARVGGGRLGTGGRGRPGWRQVPRRLSP 

GFGR*GGTGPGPVGTSGKRGPSRRRAPAN 

DKAACWPRFPGQPAS*TGFRGERGVKGFS 

SWGSGWRAWEDGGTVH 


2636 


A 


70 


792 


HGLVLDVRGPLSHAAPYWAPYPAATAAA 
ARTAPLPPRSATV*/SGPQPDFQELRKTWPS 
QCVGMARREPLLPITAIPRVVVETTP'GFAK 
QEPSVAGLRCRGSEAPA*LLHGVHRNVS/E 
TPGPEMGRPG*GNHRQRPGKQRGIPSSGLP
GRCSGSRGPHSSPGQKPHGSTLSGRRGADP 
RPRRRVYLSTPLLCEKKPHHDTHKRKPGM 
GDGNNPCPWNAGLYGQATRFAPLPLCPRR 

Ivxlva/V V o 


2637 


A 


571 


172 


SPLRPLLLALALASVPCAQGACPASADLKH 

SDGTRTCAKLYDKSDPYYENCCGGAELSL 

ESGADLPYLPSNWANTASSLWAPRCELT 

VWSRQGKAGKTHKFSAGTYPRLEEYRRGI 

LGDWSNAISALYCRCS 


2638 


A 


169 


1144 


INYSLEBCHVGALGRVLFSL*RAGCPGMGST 

RERGLYLGKHRGSGGIW*ALAGP*KSRGD 

SVSLTQGHTHVCSRSPR*ADSPPG/SHLSPV 

PHSVEVAGHVLVPATRAAVPCSASAGA*Q 

STYRTGVHQGNPTV*TK/PSRRPSGGVAK* 

FLPSAVRGEPGAKPLVDDLLPGWSLATHG 

KlrrL, VAAruouL W LyKr/vU A VjL/il 1 AuUor 

CPRSTSRPSGPSGVQGCPLG*AGSGASASR 

SEPPGSTSCCPRAP\T*PAAPCVPDWPAGDQ 

WRSHGYLPPSREL*G/WMPPSRPATLPQLA 

FARQRQGNRFDAAFESSGEDFHQMPRVGR 

MG 


2639 


A 


1 


1461 


MRELYSIWLKGYWTEGDWAQSPPRSPREA 

LEGIRVHLRCFKAYGnVLCQCPWNTPLLP 

WKPGTKHYEPVQDLRLVNQATVTLHPTV 

PNPYTLLGLLPAEDIWFTHLDLKDAICSIRI 

APESQKLFAFQWEDLQSGVTTQYTWNWL 

PQGWVLKRVDALFQHLEDCGYKVPKKKS 

QICRQQVRYLGFTIWKGEHSLWSERKQVIC 

SLPEPKTRRQVREFLGAVGFCTLWIPNFAV 

LAKHLYGITKGGNWEPFEWGPLQQQAFLS 

ESPVEHNCVEVLDSVYSSRPDLRDQPWAS 

VDLELYLDGSSFINPQGERCAEYAWTLDA 

\/!L' f fVPT P/^OHPO A CW A "CT TAT TD A T CT CCPH 

CIWKDCMAPLRPRWKGPQTVILTTPTAV 
i?p a THMwnnnFWT PPi?rrnwopATWA 

JSJ\OJLr\l\JlN W K^LJUEt W ivi AJ/ivX ly I I vJJr/V I W A. 

QYGSWGYYNPIYMLNQMIWLQAVLEITTN 
KTGRALTIIAWQETQMRNPTYQDRLALDY 
LLAAEGGVCGKFNLTN 


2640 


A 


254 


418 


MAISWKPTGLPWHSMLQVLLAAWLPGPTP 

TPWQ AT PQT7CPPPQT PPTTX/TPT PISfrV* . 


2641 


A 


433 


3 


ASFFNFSICICKHLEVGPPVGHPAHDDVGG 
RHGPGGR/GSRSPRSLQCAPGGGRRSGCPA 
GSSPASTCPPSPGGSGADRFGPSPPPPSREA 
APTAGAAASSTSSGASCPPVPASSRWGVRS 
R1RSGSGGEREPRDRPSERPRLV 


2642 


A 


2 


798 


WEF AD VEKKGAGRTEFRYPS YVOHIMGD 

T V JU/J. iiX/ V ,1 ^ 1 ViVVJ /\\J JL\ X 4— /A XV T V^X JLLXY.1VJ.1-/ 

IFSQGFGPFRWVCTSGDPQDLAVTDELATS 

VLEEAIADGVKVSVKLQYMDNIRWIREAA 

RHRLWGSQARILYSDQKGRVAIAVAINQ 

AIACRRDCAPVVLSRDHHDVSGTDSPFRET 

SNIYDGSAFCADMAVQNFVGDACRGATW

VALHNGGGVGWGEVINGGFGLVLDGTPE 
AEGRARLMLSWDVSNGVARRCWSGNQK 
AYEIICQTMQENSTLWTLPHKVEDERVLQ 
QALQL 


2643 


A 


1 


2504 


QISSGRELRVIQESEAGDAGLPRVEVILDCS 

DRQKTEGCRLQAGKECVDSPVEGGQSEAP 

PSLVSFAVSSEGTEQGEDPRSEKDHSRPHK 

HRARHARLRRSESLSEKQVKEAKSKCKSIA 

LLLTDAPNPNSKGVLMFKKRRRRARKYTL 

VSYGTGELEREADEEEEGDKEDTCEVAFL 

GASESEVDEELLSDVDDNTQWNFDWDSG 

LVDIEKKLNRGDKMEMLPDTTGKGALMF 

AKRRERMDQITAQKEEDKVGGTPSREQDA 

AQTDGLRTTTSYQRKEEESVRTQSSVSKSY 

EEVSHGLGHVPQQNGFSGASETANIQRMVP 

MNRTAKPFPGSVNQPATPFSPTRNMTSPIA 

DFPAPPPYSAVTPPPDAFSRGVSSPIAGPAQ 

PPPWPQPAPWSQPAFYDSSERIASRDERISV 

PAKRTGILQEAKRRSTTKPMFTFKEPKVSP 

NPELLSLLQNSEGKRGTGAGGDSGPEEDY 

LSLGAEACNFMQSSSAKQKTP\PPVAPKPA\ 

VKSSSSQPVTPVSPVWSPGVAPTQPPAFPTS 

NPSKGTWSSIKIAQPSYPPARPASTLNVAG 

PFKGPQAAVASQNYTPKPTVSTPTVNAVQ 

PGAVGPSNELPGMSGRGAQLFAKRQSRME 

KYWDSDTVQAHAARAQSPTPSLPASWKY 

SSNVRAPPPVAYNPIHSPSYPLAALKSQPSA 

AQPSKMGKKKGKKPLNALDVMKHQPYQL 

NASLFTFQPPDAKDGLPQKSSVKVNSALA 

MKQALPPRPVNAASPTNVQASSVYSVPAY 

TSPPSFFAEASSPVSASPVPVGIPTSPKQESA 

SSSYFVAPRPKFSAKKSGVTIQVWKPSVvE 

E 


2644 


A 


938 


652 


RSSDGHAAETSRSCQLH*VSRSRNHPGPQP 
SGNTLRVRQSLSPPDSRTLASAILAPP/TPLS 
SFRALALQPQEENRREEEMKEEGQVLGAV 
PLRTS 


2645 


B 


182 


394 


MATHPSLLVCQVGLLGAQVPSVRAGMPQS 

RRQTEGAQGMVRNEEGGSLRLSHHQACK 

ATHTQQWTLEVTAQ 


2646 


B 


1 


591 


MTIHILILLLLLAFSAQGDLDTAARRGQHQ 

WQHRGHVCYLGVCRTHRJLAEnYWIRCLH 

QGALGEGQPRAPGPLQLWAPPVARGGSPA 

RFPGFRPAARGLAQCPARWVTSGTARPLL 

GFSLPIWLQRDMAEAHQAVGFRPSLTSDG 

Ac V c.JLoAr V lA^bl i LoULKIs W KKliLbKr W 

VRSGAGRFPSGDPGFCFRDV 


2647 


A 


1 


787 


FQEAAVQLYSHAPHVQLRLKISPGHSPPAL 
GLSFPPGQGRGFSCQLLPASFSWGIPQRPLP 
QREPPGRTRTPAWSCSWGPAIPPVHTLVPA 
PSPGPGADRGGSQGPGLLVQGLPLGSLAP*
ALGLPGASADTPVPRRLHSQACCSHGVTG 

*GMG*GDVSPVPVPQGPLGWHLFRVPAGS 

QRSSPffHQVLGGTRQPLGPGPVRKWTELA 

GDTGDKKEASSPKELVGPQRVGGLAGTVT 

LVPHLCCGRRAPPGGLDGAVEIVA 


2648 


A 


2466 


3395 


KALCPCLPVPLVHGNVEVAGPRSGGACPT 

LGLWLFNPPGNHAATLRAHGQPCTALWR 

PLKPSPQGYLEGAARGSAAKRPLQRALVS 

LDPGLGVLAATRLPGPVAGGWETQYMCC 

SAAAGSVGCQVAKQHVQDGRKERLEGFV 

KTrEKJBJLSGDTHPQlYALDCEMSYTTYGLE 

LTRVTVVDTDVHVVYDTFVKPDNEIVDYN 

TRFSGVTEADLADTSVTLRDVQAVLLSMF 

SADTDLIGHSLESDLLALKVIHSTWDTSVL 

FPHRLGLPYKRSLRNLMADYLRQIIQDNVD 

OXloooJC/JJ AvJ/\V^iYLtli-» VlWi\ 


2649 


A 


178 


556 


QSPQEHFHPECGRRDILCQVRQEIRWPNPG 
EVHHLGLEICPVWILQLHLALRTRAPEHPL 
Kl V xiKr OvjIjA V ^KLr V rr r LKlXy Ax^UKjr n, V 
PAAGRPRPARSSPGQWPP*/PAAVAPPVTE 
RPPTPSAA 


2650 


A 


803 


1068 


RAMEPLLLGRGLIVYLMFLLLKFSKAIEIPS 
SGKVKTFSAILLSMDSPFQAGGIFGTPPGLG 
SRILSPSPMVSLGSCCTHRSPICFSP 


2651 


B 


1 


559 


MAERAAGGQLPSQGPVQLPSTRKEKDEQT 

ENQQLFFIRQRTESPGKARPPNLETQTSGFQ 

EPQLTGAEPLRGQCHGLELPLMNFWRCHL 

DKTNLRLKEELJKAEKKSGFWDNLVLKQNI 

QSKKPDEIEGWEPPKLALEDISADPEDTVG 

GHPSWSGWEDDAKGSTKYTSLASSANSSR 

WSLRAAGKAX 


2652 


A


1 


526 


FRLGRKPR*GGVM*PVWSRGEPGSVGAEA 

G/RS*SAPRRLLHHPAAGLATGLSASGRRS 

ARWKMERASGLSPGGGLGATSRQMSPGT 

QLANPPDHGDKDCLGRISPGSGKQIQAAG 

QLPGPPTSLAPAQGRLRSLTPWGLQTPEHS 

PPTJOTfTHTT OA ATTh AVT PWQTfYMT TTK'PTsJT TV/T 
JEJr JDVjrlOIll^ViAA 1 HA V INITIO 1 viNi^l 1 JSJKJN JL1VL 


2653 


A 


3 


396 


AAYTLLLHAELLQWSDKPCVPHLLQRDSY 
YVYTQQELKEKLYQEnSYFDKGKMWEKA 
KI^KEIAETYESKVFDYEGLGNLLKKRAS 
FYENIIKAMSPQPEYFAVGYYGQGFPSFLR 
NKIFIYRGKEYER 


2654 


C


1 


507 


MPLTHPNHGPDTLQRW TSSQ Ir iSLoSKLN 

PEPEADAASELIATSELYKQSDPYLDILARV 

YGPPTAAEENLKCLKEQGQAHLRHFLLCK 

MA APT AWT TA AMFFNWTFTRROWOVFFP 

GAREEEKSLKSPRFLALKVLRKGADFQRL 

RLYQANMGQAKLPLALFHPLC 


2655 


A 


178 


1206 


ALMNKCAVSTGRQRCSVMWARACSVFCV 

LTLRNTGAQKHWLTEGAAKEHCVSDDSE 

HFESWRAAQLFESVDAEPMNMESQLHFIM

PKALRTKKAASDSSKEQVANSRESSPSPKE 
VNDSPRAATKSPESQNLIDGTKKPSLKQPD 
SPRNISSDNSSKGTPSSPAGSTTAEPKVRIKT 

VSVMASVTSLLSSPASAAALSSPPRVPLQS 

A WTM A \rV7X> A TJDT r DV'0\rTTV"D , \/' ATA "CT "OWO 

AVKTAGSQVINLKI^NNTTVKATVIPAAS 
VQSASSmKAANAIQEQAVMMPASSLANA 
KLVPKTVHLANLNLLA 


2656 


A 


215 


389 


KGAGVLQTFGSSESVFCTOVDRELLIFAYQ 
NILLFLKNKRALILETTCFGWVGTVKRT 


2657 


A 


1 


737 


FRGEIAENLPEQDILIQSVCETMVPKLVAED 
IPLLFSLLSDVFPGVQYHRGEMTALREELK 
KVCQEMYLTYGDGEEVGGMWVEKVLQL 
i yt 1 VjlJNliuLMMVUrbubCjrKbMA WKVLL 
KALERLEGVEGVAHIIDPKAISKDHLYGTL 
Latin 1 Ivc W 1 -UVjJLr 1 xl V IJKJsJLlJJo V KLjri-Lv^JvK 

QWIVFDGDVDPEWVENLNSVLDDNKLLTL 
PNGERLSLPPNVPJMFEVQDLKYATLATVS 
RCGMVWFSED 


2658 


B 


41 


166 


MKIAALLGCMMMAARCGTLSAMRDLSFS 
DENRRLAVGTAAAA 
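

As an illustration of how the columns of Table 8 relate to one another, the following minimal Python sketch checks the row for SEQ ID NO: 2658 (Method B, nucleotides 41 to 166) against its listed peptide. The class name Table8Row, its field names, and the reading of a beginning location larger than the ending location as a reverse-strand translation are assumptions made for illustration only; they are not defined in this document.

```python
from dataclasses import dataclass

# Symbols taken from the Table 8 column legend.
LEGEND = {
    "X": "unknown residue",
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

# Frameshift markers annotate the nucleotide sequence; they are not codons.
FRAMESHIFT_MARKERS = {"/", "\\"}


@dataclass
class Table8Row:
    """Hypothetical container for one row of Table 8."""
    seq_id: int
    method: str
    begin: int    # predicted beginning nucleotide location
    end: int      # predicted ending nucleotide location
    peptide: str  # amino acid sequence using the legend symbols

    def span_nt(self) -> int:
        # Inclusive number of nucleotides between the two locations.
        return abs(self.end - self.begin) + 1

    def strand(self) -> str:
        # Assumption: begin > end is read here as a reverse-strand translation.
        return "reverse" if self.begin > self.end else "forward"

    def codons(self) -> int:
        # Every symbol except the frameshift markers stands for one codon.
        return sum(1 for ch in self.peptide if ch not in FRAMESHIFT_MARKERS)

    def legend_symbols(self) -> set:
        # Which special legend symbols occur in this peptide.
        return {ch for ch in self.peptide if ch in LEGEND}

    def is_consistent(self) -> bool:
        # True when the nucleotide span is exactly three times the codon count.
        return self.span_nt() == 3 * self.codons()


# Row for SEQ ID NO: 2658 as listed above.
row = Table8Row(2658, "B", 41, 166,
                "MKIAALLGCMMMAARCGTLSAMRDLSFS" "DENRRLAVGTAAAA")
print(row.span_nt(), row.codons(), row.is_consistent(), row.strand())
```

For this row the span is 126 nucleotides, which corresponds to the 42 listed residues. Rows containing the "/" or "\" markers would not be expected to satisfy this check exactly, since those symbols flag a possible frameshift in the underlying nucleotide sequence.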


2659 


A 


1 


894 


MPGPMSLWLLLLVLPLSLEHSDLRICFPGQ 

WSMESSSTGFIWTDVRAWQTSNRHVSSW 

REPRHSRMPPGAGLMEPJQAIAQNVSDIAV 

KVDQE.RHSLLLHSKVSEGRRDQCEAPSDP 

KFPDCSGKVEWMRARWTSDPCY AFFGVD 

u 1 JbCor LI YLbxi VE WFCPPLP WRNQTAAQR 

APKPLPKVQAVFRSNLSHLLDLMGSGKES 

LIFMKKRTKRLTAQWALAAQRLAQKLGA 

TQRDQKQILVHIGFLTEESGDVFSPRVLKG 

GPLGEMVQWADILTALYVLGHGLRVTVSL 

KELQR 


2660 


A 


3 


14703 


AAAVSARRAAAGGSRGAGGWGTADASG 

AMAEGGEGGEDEIQFLRTEDEWLQCIATI 

HKEQRKFCLAAEGLGNRLCFLEPTSEAKYI 

PPDLCV CNFVLEQSLS VRALQEMLANTGE 

NGGEGAAQGGGHRTLLYGHAVLLRHSFS 

GMYLTCLTTSRSQTDKLAFDVGLREHATG 

EACWWTIHPASKQRSEGEKVRIGDDLILVS 

VSSERYLHLSVSNGNIQVDASFMQTLWNV 

HPTCSGSSffiEGYLLGGHWRLFHGHDECL 

TIPSTDQNDSQHRRIFYEAGGAGTRARSLW 

RVEPLRISWSGSNIRWGQAFRLRHLTTGHY 

LALTEDQGLILQDRAKSDTKSTAFSFRASK 

ELKEKLDSSHKRDIEGMGVPEIKYGDSVCF 

VQHIASGLWVTYKAQDAKTSRLGPLKRKV 

ILHQEGHMDDGLTLQRCQREESQAARIIRN 

TTALFSQFVSGNNRTAAPITLPIEEVLQTLQ 

DUAYFQPPEEEMRHEDKQNKLRSLKNRQ 

NLFKEEGMLALVLNCIDRLNVYNSVAHFA 

GIAREESGMAWKEILNLLYKLLAALIRGNR 

NNCAQFSNNLDWLISBCLDRLESSSGE-EVL 

HCILTESPEALNLIAEGHIKSIISLLDKHGRN 

HKVLDILCSLCLCNGVAVRANQNLICDNL 

LPRRNLLLQTRLINDVTSIRPNIFLGVAEGS 

AQYKKWYFELIIDQVDPFLTAEPTHLRVG 

WASSSGYAPYPGGGEGWGGNGVGDDLYS 

YGFDGLHLWSGPJPRAVASINQHLLRSDD 

VGKLLPGPRGCPASHSASMGSPCRGCLENF 

NTDGLFFPVMSFSAGVKVRFLMGGRHGEF 

KFLPPSGYAPCYEALLPKEKMRLEPVKEY 

KRDADGIRDLLGTTQFLSQASFIPCPVDTSQ 

VIIPPHLEKIRDRIAENIHELWGMNKIELG 

WTFGKIRDDNKRQHPCLVEFSKLPETEKN 

YNLQMSTETLKTLLTLGCHIAHVNPAAEE 

DLKKVKLPKNYMMSNGYKPAPLDLSDVK 

LLPPQEILVDKLAENAHNVWAKDRIKQGW 

TYGIQQDLKNKRNPRLVPYALLDERTKKS 

NPJDSLREAVRTFVGYGYNIEPSDQELADSA 

VEKVSIDKIRFFRVERSYPVRSGKWYFEFE 

WTGGDMRVGWARPGCRPDVELGADDQ 

AFVFEGNRGQRWHQGSGYFGRTWQPGDV 

VGCMINLDDASMIFTLNGELLITNKGSELA 

FADYEIENGFVPICCLGLSQIGRMNLGTDA 

STFKFYTMCGLQEGFEPFAVNMNRDVAM 

WFSKRLPTFVNVPKDHPHIEVMRIDGTMD 

SPPCLKVTHKTFGTQNSNADMIYCRLSMP 

VECHSSFSHSPCLDSEAFQKRKQMQEILSH 

TTTQCYYAIRIFGGQDPSCVWVGWVTPDY 

HLYSEKFDLNKNCTVTVTLGDERGRVHES 

VKRSNCYMVWGGDIVASSQRSNRSNVDL 

EIGCLVDLAMGMLSFSANGKELGTCYQVE 

PNTKVFPAVFLQPTSTSLFQFELGKLKNAM 

PLSAAIFRSEEENPVPQCPPRLDVQTIQPVL 

WSRMPNSFLKVETERVSERHGWWQCLEP 

LQMMALHIPEENRCVDILELCEQEDLMRF 

HYHTLRLYSAVCALGNSRVAYALCSHVDL 

SQLFYAIDNKYLPGLLRSGFYDLLISIHLAS 

AKERKIMN4KNEYnPITSTTRNICLFPDESK 

RHGLPGVGLRTCLKPGFRFSTPCFWTGED 

HQKQ SPEIPLESLRTKALSMLTEA VQ CS GA 

HIRDPVGGSVEFQFVPVLKUGTLLVMGVF 

DDDDVRQILLLEDPSVFGEHSAGTEEGAEK 

EEVTQVEEKAVEAGEKAGKEAPVKGLLQT 

RLPESVKLQMCELLSYLCDCELQHRVEAIV 

AFGDIYVSKT OANOK'FTIVNFT MDAT N1V/TQ 

AALTARKTKEFRSPPQEQINMLLNFQLGEN 
CPCPEEIREELYDFHEDLLLHCGVPLEEEEE 
EEEDTSWTGKLCALVYKIKGPPKPEKEQPT 
EEEERCPTTUKELISQTMICWAQEDQIQDSE 
LVRMMFNLLRROYDSIGELLQALRKTYTIS 

HTSVSDTINLLAALGQIRSLLSVRMGKEEE 

LLMINGLGDIMNNKVFYQHPNLMRVLGM 

HETVMEVMVNVLGTEKSQIAFPKMVASCC 

RFLCYFCRISRQNQKAMFEHLSYLLENSSV 

GLASPSMRGSTPLDVAASSVMDNNELALS 

LEEPDLEKWTYLAGCGLQSCPMLLAKGY 

PDVGWNPffiGERYLSFLRFAVFVNSESVEE 

NASVWKLLERRPECFGPALRGEGGNGLLA 

AMQGAHCISENPALDLPSQGYKREVSTEDD 

EEEEEIVHMGNAIMSFYSALIDLLGRCAPE 

MHLIQTGKGEAIRIRSILRSLVPTEDLVGIISI 

PLKLPSLNKDGSVSEPDMAGNFCPDHKAP 

MVLFLDRVYGKDQTFLLHLLEVGFLPDLR 

ASASLDTVSLSTTEAALALNRYICSAVLPL 

LTRCAPLFGGTEHCTSLIDSTLQTrYRLSKG 

PvSLTKAQRDTffiECLLAICNHLRPSMLQQL 

LRRLVFDVPQLNEYCKMPLKLLTNHYEQC 

WKYYCLPSGWGSYGLAVEEELHLTEKLF 

WGHDSLSHKKYDPDLFRMALPCLSAIAGA 

LPPDYLDSPJTATLEKQISVDADGNFDPKPI 

NTMNFSLPEKLEYTVTKYAEHSHDKWACD 

KSQSGWKYGISLDENVKTHPLIRPFKTLTE 

KEKEIYRWPARESLKTMLAVGWTVERTKE 

GEALVQQRENEKLRS VSQAN QGNSYSP AP 

LDI^NWI^RELQGMVEVVAENYHNIWA 

KKKKLELESKGGGSHPLLVPYDTLTAKEK 

FKDREKAQDLFKFLQVNGIIVSRGMKDME 

LDASSMEKRFGYKFLKKILKYVDSAQEFIA 

HLEAIVS SGKTEKSPRDQEIKFFAKVLLPLV 

DQYFTSHCLYFLSSPLKPLSSSGYASHKEK 

EMVAGLFCKLAALVRHRISLFGSDSTTMV 

SCLfflLAQTLDTRTVMKSGSELVKAGLRAF 

FENAAEDLEKTSENLKLGKFTHSRTQIKGV 

SQNINYTTVALLPILTSIFEHVTQHQFGMDL 

LLGDVQISCYHILCSLYSLGTGKNIYVERQ 

RPALGECLASLAAAIPVAFLEPTLNRYNPL 

SVFNTKTPRERSILGMPDTVEDMCPDIPQL 

EGLMKEINDLAESGARYTEMPHVIEVILPM 

LCNYLSYWWERGPENLPPSTGPCCTKVTS 

EHI^LILGNIIiCIIhnWLGroEASWMKRIAV 

YAQPnSKARPDLLRSKFIPTLEKLKKKAVK 

TVQEEEQLKADGKGDTQEAELLILDEFAV 

LCRDLYAFYPMLIRYVDNNRSNWLKSPDA 

DSDQLFRMVAEVFILWCKSHNFKREEQNF 

VIQNEINNLAFLTGDSKSKMSKAMQVKSG 

GQDQERKKTKRRGDLYSIQTSLIVAALKK 

EEVREHLRNNLHLQEKSDDPAVKWQLNL 
YKDV1JKSEEPFNPEKTVERVQRISAAVFHL 
EQVEQPLRSKKAVWHKLLSKQRKRAVVA 
CFRMAPLYNLPRHRSINLFLHGYQRFWIET 

EEYSFEEKLVQDLAKSPKVEEEEEEETEKQ 

PDPLHQIILYFSRNALTERSKLEDDPLYTSY 

SSMMAKSCQSGEDEEEDEDKEKTFEEKEM 

EKQKTLYQQARLHERGAAEMVLQMISAS 

KGEMSPMWETLKLGIAILNGGNAGVQQK 

MLDYLKEKKDAGFFQSLSGLMQSCSVLDL 

NAFERQNKAEGLGMVTEEGTLIVRERGEK 

VLQNDEFTRDLFRFLQLLCEGHNSDFQNFL 

RTQMGNTTTVNVnSTVDYLLRLQESISDFY 

WYYSGKDHDESGQHNFSKALAVTKQIFNS 

LTEYIQGPCIGNQQSLAHSRLWDAWGFL 

HVFANMQMKLSQDSSQIELLKELLDLLQD 

MVVMLLSLLEGNVVNGTIGKQMVDTLVE 

SSTNVEMIIJCFFDMFLKLKDLTSSDTFKEY 

DPDGKGnSKKEFQKAMEGQKQYTQSEIDF 

LLSCAEADENDMFNYVDFVDRFHEPAKDI 

GFNVAVLLTNLSEHMPNDSRLKCLLDPAE 

SVLNYFEPYLGPJEIMGGAKKIERVYFEISE 

SSRTQWEKPQVKESKRQFIFDWNEGGEQ 

EKMELFVNFCEDTIFEMQLASQISESDSAD 

RPEEEEEDEDSSYVLEIAGEEEEDGSLEPAS 

AFAMACASVKRNVTDFLKRATLKNLRKQ 

YRNVKBCMTAKELVKVLFSFFWMLFVGLF 

QLLFTILGGIFQILWSTVFGGGLVEGAKMIR 

VTKTLGDMPDPTQFGIHDDTMEAERAEVM 

EPGrTTELVHFIKGEKGDTDIMSDLFGLHPK 

KEGSLKHGPEVGLGDLSEIIGKDEPPTLEST 

VQKKRKAQAAEMKAANEAEGKVESEKAD 

MEDGEKEDKDKEEEQAEYLWTEVTKKKK 

RRCGQKVEKPEAFTANFFKGLEIYQTKLLH 

YLARM^YNLRFLALFVAFAINFrLLFYKVTE 

EPLEEETEDVANLWNSFNDEEEEEAMVFF 

VLQESTGYMAPTLRALAnHTHSLVCWGY 

YCLKVPLWFKREKEIARKLEFDGLYITEQ 

PSEDDIKGQWDRLVINTPSFPNNYWDKFV 

KRKVINKYGDLYGAERIAELLGLDKNALD 

FSPVEETKAEAASLVSWLSSIDMKYHrWKL 

GVVFTDNSFLYLAWYTTMSVLGHYNNFFF 

AAHLLDIAMGFKTLRTILS S VTHNGKQLVL 

TVGLLAVVVYLYTWAFhtfTRKFYNKSED 

DDEPDNDCCDDMMTCYLFHMYVGVRAGG 

GIGDEffiDPAGDPYEMYRIVFDrTFFFFVrVI 

LLAnQGLUDAFGELRDQQEQVREDMETK 

CFICGIGNDYFDTTPHGFETHTLQEHNLAN 

YIJFFIMYIJNKDETEHTGQESYVWKMYQE 

RCWDFFPAGDCFRKQYEDQLG 


2661 


C 


54 


350 


MLNSSEQRRPHGVLDSVWPGIHGALCAGR 
WLRTGQLSWDTRHMLARKMVSSSEPQRP 
PTSWSWCCLASTVRPLLVDGSGWGSCRGR 
PAACWKEDGQFF 


2662 


A 


50 


646 


SSALLSSNQTASFGSCSLSLPCSARERTPEG 

GGWPGGRLSEPLPAMLLLWVSWAALAL 

AVLAPGAGEQRRRAAKAPNWLWSDSFD 

GRLTFHPGSQWKLPFINFMKTRGTSFLNA 

YTNSPICCPSRAAMWSGLFTHLTESWNNF 

KGLDPNYTTWMDVMERHGYRTQKFGKL 

DYTSGHHSIRHSERGSTNQRSEKV 


2663 


B 


44 


293 


MPVWWRRRRLRARSWALRARPLSLPRAQ 
RSGRLLRRPKGYAPGAPKAHELSPQAICAV 
AFX 


2664 


C 


40 


495 


MVILNALQRRAFLCAANVKIPRLRIKVKTK 

EASAQWKEECNKYLLFLLPVPSAGLLPSI 

MEIADPFSSFGSEDKCYTLTPPLPRHTEKSS 

DSQEKGHFEAGVEPKSRGSTPGQYPGIGCF 

ARFPJEYQIGMRHLTTRPAMHRAQVLFPLS 

F 


2665 


A 


587 


2 


FLTRETGDPTGRSSSHANTQSRFFPDDPPGX 

PLNNLGNTHGCGRRAGRCPGTGPDGP\AG 

CGGPRCWPSGHLAATGD*GPSCGRLGANR 

GEAGPAGFTACSPLSGCRTPYTHHFPASRM 

SCHLNCASPRTYRSQGNRGCERVAQGSQG 

AGGERGAKSQVPVPAPARNKDPAKCRKPR 

NRRPGNSGPWRAYRRQR 


2666 


A 


1 


1853 


RARRLALQCHV CVCALTPGEQSGRRLPGQ 

TWLMFSCFCFSLQDNSFSSTTVTECDEDPV 

SLHEDQTDCSSLRDENNKENYPDAGALVE 

EHAPPSWEPQQQNVEATVLVDSVLRPSMG 

NFKSRKPKSIFKAESGRSHGESQETEHWS 

SQSECQVRAGTPAHESPQNNAFKCQETWR 

L\QPRIDQRTATSPKDAFETR\QDLNEEEAA 

QVHGVKDPAPASTQSVLA\DGTDSADPSPV 

HKDGQNEADSAPEDLHSVGTSRLLI/YHIT 

DGDNPTAVRHGCSL/FSGQSQRFNLDPESA 

PSPPSTQQFMMPRSSSRCSCGDGKEPQTTr 

QLTKHIQSLKRKIRKFEEKFEQEKKYRPSH 

GDKTSNPEVLKWMNDLAKGRKQLKELKL 

KLSEEQGSAPKGPPRNLLCEQPTVPRESGK 

PEAAGPEPSSSGEETPDAALTCLKERREQL 

PPQEDSKVTKQDKNLIKPLYDRYRIIKQILS 

TPSUPTIQEEEDSDEDRPQGSQQPSLADPA 

SHLPVGDHLTYSNETEPVRALLPDEKKEV 

KPPALSMSNLHEATMPVLLDHLRETRADK 

KRLRKALREFEEQFFKQTGRSPQKEDRIPM 

ADEYYEYKHIKAKLRLLAPPAGSYFP 


2667 


C 


147 


398 


MYKAQFLAASPGRCLGLLAASNHHAKSIH 
GFRRLVKTMRNRLCSLCQPFPLPKHLLSLS


2668 


A 


1 


1787 


MSKGESRKCNEENVSKSSKWKVFIVLTPQ 

FLSRDKDQLTKELQKHVKSVTVSCKSPRK 

LLSHTTRLHPPSKGQGENLTHLVDSIKATrW 

CQPVWETVEGQRRRVGNCEDFTNGCDLVG 

SSSLHNMLVCSSYDINRQDTFQKDRTSEKH 

LLDSVFTALQDSAGQQWPARLHPQRGEEV 

ADPRGAPSRHVEPENSSPCQGNGEQAGKA 

GARALCGQARRSPATMPPPLTTRSLCEFAV 

FLIJffWIJFPELFHYRKLGEQDSCYGDGGKQ 

ELDPQRLQnCNFTEVYFPHMQEEEAWRQA 

GPGPAEAAD/TSATSRRSTSPTCRRRRPGCS 

GAPSASTTSFRAWGWTQAAKASPPRDNCY 

NSSSLPDDISIJTHDNLHKQHSCSDSLGKK 

QLDPSCIKIJRH*VHLLYLCrKNNRVWTLE 

FMGNIiTWNRNRGAPTSSSARSTCWPRV*R 

JtliiiiL^C/JN v^o c> V V ^vjor AAJr JiJvooisx/rv^ 

KIPLDEVWPH*/DFPVRSPYLLSDKEVCKI 

VQQSLSVGNFAAGLL/LPPRTSSCSTTIFGL/ 

DNKKQLDPTQLRLICH*VEAVYPVEKVEE 

VWHCECIPSNDEQCHCPNRKKCNILKKAK 

JS.V JCrJV 


2669 


A 


14 


425 


RRFREPDAQMLEIPNLTPYTHYRFRMKQV 
NIVGPSPYSPSSRVIQTLQAPPDVAPTSVTV 
RTASETSLRLRWVPLPDSQYNGNPESVGY 
RK.YWRSDLQSSAVAQWSDRLEREFTIEE 
LEEWMEYELQMQAFNAVG 


2670 


B 


1 


825 


MRALKLQRPJCSFWIWAWEAFVQLVNYE 
CKVGEWKGLAHCVSQNNKYRTTYIIAGVP 
NPQEPGYTAGGQLKGNDLTVLHLLVIEGK 
WEAVRKFPFKKYTVNTATVKEARKYWVEE 

OooL-/VJv/\. I KoINr vj i L\£r i ivliv 1 vjrJUr V r/vr r rv. 

LPFGPPCPLSCTHINPKPQAPEADQQLPIHL 
AESHFHHSIKPRIHPSSPCVTRFFLDAEREL 
GIQKAVPWSFTLVKKQKSLGLPSVQDFGS 
VYKMNIWSDVACCDPQLQQPAASAQTSAI 

oV^IjOiV V 1 JDo 


2671 


B 


475 


848 


XRTERVHLRITPGDDSRKRSSASHYRVAGI 
SRLTLSLDREQLYLEQSTEGPEQDKREGKS 
ARSSSREPTGQPRTLLGGMRARKRKTLVL 
GPFPRVISGSNAKMDTLSPACACAFALYGI 
PKPAA 


2672 


A 


3 


765 


LGTVSYGADTMDEIQSHVRDSYSQMQSQA 
GGNNTGSTPLRKAQSSAPKVRKSVSSRIHE 
AVKAIVLCHNVTPVTESRAGVTEETEFAE 
AnnnTTQnPMPTvnA^iQpnFVAT vnwTF<;v 

/\L/\£LJr&lJIlrij\. 1 I V^AOOi UE» V r\±u V W 1JDOV 

GLTLVSRDLTSMQLKTPSGQVLSFCELQLFP 
FTSESKRMGVIVRDESTAEITFYMKGADVA 
MSPIVQYNDWLEEECGNMAREGLRTLW
AKKALTEEQYQDFEVSRLPGIPSSY/DRCLP 

I r\JZl ijOj J-»v>IYl_TvJ-rJC« JLfVJ o J-» 


2673


A 


o 


413 


EPKSLIQIIKQSrVELKLQAEDSFVLKVVQL 

EELLQVRHSVFIVGNAGSGKSQVLTLASNE 

PJPLNRTMRLVFEISHLRTATPATVSRAGIL 

YINPADLGWNPWSSWIERRKVQSEKANL 

MHJFDKYLPTCLDK 


2674 


A 


379 


17 


SWGVWYKYQPLDLVRRYFGEKIGLYFAW 

LGWYTGMLFPAAFIGLFVFLYGVTTVDHS 
QVSKEVCQATDIIMCPVCDKYCPFMRLSDS 
CVY AK VTHLFDNG ATVFS A VFMA VW ATV 
LMEFGK 


2675 


A 


1 


1833 


MVDSLIARVGVMARGNAITLPVCGRDVKF 

TLEVLRGDSVEKTSRVWSGNERDQELLTE 

DALDDLIPSFLLTGQQTPAFGRRVSGVIEIA 

DGSRRRKAAALTESDYRVLVGELDDEQM 

AALSRLGNDYRPTSAYERGQRYASRLQNE 

FAGNISALADAENISHSDKFDANDPILKDQ 

TQEWSGSATFTSDGKIRLFYTDYSGKHYG 

KQSLTTAQVNVSKSDDTLKINGVEDHKTIF 

DGDGKTYQNVQQFIDEGNYTSGDNHTLRD 

PHYVEDKGHKYLVFEANTGTENGYQGEES 

IJFNKAYYGGGTNFFRKESQKLQQSAKKRD 

AEIANGALVNTQSTTTRRPGSNSLSHLMW 

PVDHQKFQSVTEMCGSILSRDFADFGTTIK 

QDFRLLGQTSVDRLLQLSQGQAVKGNQLL 

PVSLVKRKTTLAPNTQTASPRALADSLMQ 

LARQVSRLESGQDFADFGTTIKQDFRLLGQ 

TSVDRLLQLSQGQAVKGNQLLPVSLVKRK 

TTLAPNTQTASPRALADSLMQLARQVSRL 

ESGQDFADFGTTIKQDFRLLGQTSVDRLLQ 

LSQGQAVKGNQLLPVSLVKRKTTLAPNTQ 

TASPRALADSLMQLARQVSRLESGQ 


2676 


B 


1 


309 


MGKAMLQLLIRAHWTVFPCEHEDNAASV 
SVTLCSDLAGGEWSAVLTGQSWQTEKEI 
DRSSKPPACLVAPQWFCSEVLRVDESYHR 
KYPVQLRPVHIAAK 


2677 


A 


2 


179 


RGKKSVTTVAGPMAQDVESLALCLQALLS 
EDMYRLDPTVLQMPFREEVKTPFPTPGCSE 


2678 


A 


34 


390 


MKRRRQLRARVFAL ALAWSLGPCW ALRV 

AVPKASXTDR.GPQRRLLASLLQENTEILGY 

LLGSVAAFGSWASRIPPLSRICRGKTFPSIH 

LWTRLLSALAGLLYASAIAAHDRHPEYLL 

R 


2679 


A 


568 


3 


SYYERINRQLIEAKMALQDREEKMEKVFD 

DffiTNMNLIGATAVEDKLQDQAAETEEALH 

AAGLKVWVLTGDKMETAKSTCY ACRLFQ 

TNTELLELTTKTIEESERKEDRLHELLIEYR 

KKLLHEFPKSTRSFKKAWTEHQEYGLEDG 

STLSLILNSSQDSSSNNYKS1FLQICMKCTA 

VLCCRMAPL 


2680 


A 


3 


394 


SSRWAFQVLSPSADSARLPGRAPGDRDCTF 
QPSAPAPSKPFLLSTPPFYSACCGGSCRRPA 
SSTAFPREESMLPLLTQDSNSKARRGILRR 

A1NLGLKESQ 


2681 


A 


42 


406 


EPGDPREGEEEEEEDEPDPEAPENGSLPRFV 
PRFNFSIJCDLTRFVDFNlKGRDVrVFLfflQK 
TGGTAFGRHLVKNIRLEOPCSCKAGQKKC 

TCHRPGKKETWLFSRFSTGWSCGLHADWT 
E 


2682 


A 


10 


932 


LQLCSMWLLRSWVQAEGAVSISDSPFSLH 

QCWAVLHKAWCVFLQLPGGFTFTLNPLSD 

NLLGKRVDSAPSWGPLGSAFRGVHMPCV 

GAAWEGKGPNLLRPSGKLGPSGSRPTPIGQ 

QQLPEVPRAK.GPLGPAAVICQ/HMPAPSTG 

GKRGSFSGRYLSASLELuOLr MAr 1 UJPo/vL 

SAPPSVSRGAR*STREKPGVYASAT*AAE1R 

EGQALGGVPRPSRNG/SGGPLGPDFGPNGPK 

LRRSKAGCPWWHLSSVDAGE*LWKQHST 

AVFSMPGTQPPWRGLITMPISPRGTEPTAH 

Tli^DBODr'T AVCTTA 

r CjrKor LrLA i oJL 1 A 


2683 


A 


1 


416 


NRLTTHSPHSPGPGGRQAPWRRQCRPASC 
PAKSTTWPVTRAPTRPPAWPPPASAPP/RY 
LLEEWFQNCYARYHQAFADPJDQSERQRH 
ESQQLATETQALAQRTQQDSTRTVGERLQ 
DTHSWKSELOREMEALAAETNLLL 


2684 


A 


356 


1356 


TPTTSGRTRKMWPRPGT*PP/ANCSANINLT 

HQPWFQVLEPQFRQFLFYRHCRYFPMLLN 

HPEKCRGDVYLLVWKSVITQHDRREAIR 

QTWARAAVRGWGPSAVRTLFLLGTASKQ 

EERTHYQQLLAYEDALYGDILQWGFLDTF 

FNLTLKEIHFLKWLDrYCPHVPFIFKGDDD 

■« rr-r* t-vttv I > vtT T T7T7T A TNTI /"YT>/"^T7XTT T?\ /'/'~1T\\ /T f~\U 

VFVNPTF^LEFLADRQrQbNLr- Vul) VJLyti 

ARPIRRKDNKYYIPGALYGKASYPPYAGG 

GGFLMAGSIARRLHHACDTLELYPIDDVF 

LGMCLEVLGVQPTAHEGFKTFGISRNRNSR 

MNKEPCFFRAMLVVHKLLPPELLAMW GL 

VHSNLTCSRKLQVL 


2685 


A 


1 


741 


VRSMSCPPSWPYCAPCPTNIGESTSPLRKTI 
ETPTLWDPKAPSCSLELPPWVLASPQRSRG 
TALPFLPSNVLPSLALPSTSFLCRPLLSHLV 

t>ot t A /~»T»/"» AimPUT "D V~E(~iWrD OTD'RA/rTQT P 

TSLIAGP(jrAJlDOrLLJ<J\JiLj WKo I rUM 1 oL»r 
APEHPASPCDSVLCSPDVSMCTLGPAARW 
DAQAKSAPLPPCCTDCKSFPHLQRPWAQP i 
HTSQATSVDSGEAGTKGMSQFTVWTWWR 

CT>Tir^TTD/^I^I3/^TrJ\nA/f^VCVTPf^PPl^QOMT P 

bKr l^li 1 xCv^vjlivjlVJJN Wvjiov i rKjrr\JO\^riLjr 

ARLDGQGLAS 


2686 


A 


396 


687 


TFCPRCGCPSGLAMRLFLSLPVLVWLSrV 
LEGPAPA*GAPEVSNPFDGLEELGKTLEDY 
TREFINRITQSELPAKMWDWFSETFRKVKE 
KLKTDS 


2687 


A 


2 


3794 


PRGPRPGASGSAMWLSPEEVLVANALWVT 

ERANPFFVLRRRRGHGRGGGLTGLLVGTL 

1WVLDSS ARV APYRILHOTODSOVYWTVA 

CGSSRKExTKIxWEWLENl^ 

DITTFVKGKIHGllAEENl^^ 

IO r KJBAELKMRKQFGMPEGEia.VNYYSCS 

YWKGRWRQGWLYLTVNHLCFYSFLLGK 

EVSLWQWVDITRLEKNATLLFPESIRVDT 



WO 03/080795 PCT/US02/25485 





RDQELFFSMFLNIGETFKLMEQLANLAMR 

QLLDSEGFLEDKALPRPIRPHRNISALKRDL 

DARAKNECYRATFPJLPRDFJRLDGHTSCTL 

WTPFNKLHIPGQMFISNNYICFASKEEDAC 

HLIIPLREVTIVEKADSSSVLPSPLSISTKSK 

MTFIJANLKDRDFLVQRJSDFLQKTPSKQP 

GSIGSRKASWDPSTESSPAPQEGSEQPASP 

ASPLSSRQSFCAQEAPTASQGLLKLFQKNS 

PMEDLGAKGAKEKMKEESWHIHFFEYGR 

GVCMYRTAKTRALVLKGIPESLRGELWLL 

FSGAWNEMVTHPGYYAELVEKSTGKYSL 

ATEEffiRDLHRSMPEHPAFQNELGIAALRR 

\0,TAYAFRNPTIGYCQANfNIWSVIiLYGS 

EEEAFWLLVALCERMLPDYYNTRWGAL 

VDQGIFEELTRDFLPQLSEKMQDLGVISSIS 

LSWFLTLFLSVMPFESAWIVDCFFYEGIK 

VILQVALAVLDANMEHLLGCSDEGEAMT 

MLGRYLDNVVNKQSVSPPffHLRALLSSSD 

DPPAEVDIFELLKVSYEKFSSLRAEDIEQMR 

FKQRLKVIQSLEDTAKRSWRAIPVDIGFSI 

EELEDLYMVFKAKHLASQYWGCSRTMAG 

RRDPSLPYLEQYRIDASQFRELFASLTPWA 

CGSHTPLLAGRMFRLLDENKDSLINFKEFV 

TGMSGMYHGDLTEKLKVLYKLHLPPALSP 

E\EAE\SALEATHLFSQRDSSSEASPLASDLD 

LFLPWEAQEALPQEEQEGSGSEERGEEKGT 

SSPDYRHYLRMWAKEKEAQKETDCDLPK 

MNQEQFIELCKTLYNMFSEDPMEQDLYHA 

IATVASLLLRIGEVGKKFSARTGRKPRDCA 

TTjCTMTDTJ A DDT TJAT1 A ADCT i"VQTJ A A /~'T^\T»/~V A 

I JbJtUJlirr AFbLHQDAAKJbLQPPAAGDPQA 

KAGGDTHLGKAPQESQVWEGGSGEGQG 

SPSQLLSDDETKDDMSMSSYSWSTGSLQC 

EDLADDTVLVGGEACSPTARIGGTVDTDW 

CISFEQILASILTESVLVNFFEKRVDIGLKIK 

DQKKVERQFSTASDHEQPGVSG 


2688 


B 


119 


682 


GDKGADEREISGGTDTAAAAQLKIHYWrP 

vjJroI vyiillJsJiVrrs 1 J^LJVDUv^NOoroKv^Aol 

CDRQFWAGGYHRSLADEAYGDEEDLPK 

WGLVHSTRGPAHPTYLLRPLQKDQDSSL 

LRASGGGGGSPSSSTKSEHSCRQIHIPGPFS 

HADITGQKWFPGGVSTEPARNMGFLKPTP 

TPLLRSPKDFR 


2689 


B 


1 


3097 


MAGARVGPAAGARTAVPAAGEVPASPAL 
TDTQKGTGIGHWWAVAPTIQTSVWPKPF 
RGNRISVLGFEPHSLVSADPQQSQYPYFLFP 
EPPSPKPLSMLEDSYAST KIOASAR APPT 

DMDKQERIKAERKRLRNRIAASQVPQAQA 

GAHLAPGKKVKTLKSQNTELASTAACCAS 

SSSLVGGSRERVSESGPfflCAQRAPPRRAL 

ARGRLMPGDTGPRELHRNPSVWWCLLV 

SLLLIGSVVMAVRFOHRNESKFENLDEVS 

MGSVNDRLSFAHHLQEHQFLFPRVAGCRA 

RGTPATPAALGRCWPWPLRPPCPASQRQK 

VAVGPKRMGPSPFRLAATVRQPERPQAPM 

AVPSCPSTPDYENMFVASQQPSTSGMNKG 

KALPAGILQMVTDTSRPNVGGDESLDCLV 

LNRISYTCRSTLSPRPSFSAPGREESGSVMA 

PDDSMGIMRSLGGLSRLTVAAIVRDVTKFC 

DPGPPHPALQETPQMAPSPGAPQPLNPPAP 

PRKRNTASAPVHLRAARDDSEAALYPFLQ 

VSYSLSGHKNNYTYYAWVVGGFRALGYK 

HSTDVCSGVTIQEEMWIRHRFLRAAPISQR 

TRHYHRFLGCSMAGSGCASDLLCCDWRD 

SCCRCSLSAAQATPLSSPRPRPRSAARLSAR 

GAATTAGSVCSGGGEVAGEPGPRRHHVG 

GAEKWGDVQWTPGDCDNWMNINLREVIC 

TSGTGQVLADARVLHPRQHHQYLRIPDEII 

DMVKEEVGPRAAAAEAC'SS'R ^Plf PRHfrW 

RWPRCFGALSCCGGRESDSTCCBCPLPFADP 

QVLHAPEKGVWEAGSRTRPRERAPRSVCP 

GSGPGPGVEATARSCRAGGAEAVEGGTGA 

QASMVNTTGYWARPLQATQGGSAAWQQ 

WGTREASPDDTTTRGLTGAKPESTNSONH 


2690 


A 


1007 


537 


SRKGSSLAAHPLSPSRT ^AVPTAnnnnn^P 

AKPHLVSPGGSEGAIWCGHGQGRGGSGND 

RGGQ\GPGAGGRRGIPTPARGAVIYKTQRR 

EEEGTRGCNQLASLSGPQGATVSPSSGGSS 

PGTCCDRHPLRADTRMMVWGQEPSPSLVC 

FPKLQPDSL 


2691 


A 


1 


1656 


METEPSKAKANDPGSAAEGVVFASISSGLG 

EVTFLSLTAFYPRAVISWWSSGTGGAGLLG 

ALSYLGLTQAGLSPQQTLLSMLGIPALLLA 

SDLRKALDKIAEIKSLLEERRIGHKYLGLRY 

CPPLYVLYTDAFWSVTPYSEVHIAFTILEEV 

SLCDSKTJHIIFVRLAYACPRFTVSAWAASI 

PEYMVRISLLTAQVDMTIIGIAFMPCPRPL 

MPTVAPTAAREMGVHHTGDSAGEKLHRA 

CCGRGRLCREHRVLALPLSSTLPYRDCAPG 

CILHFPPFVHRYEVDDIDEEGKARHTVSLR 

RHPLTRWKANPETDPEALLVKEKTMFSGC 

CNLGDSTANTGSLGNTAKWARVPNYTNM 

QRLWAPNVGLRCYLLDTRLKGQGKECES 

PPMIGOlSICMHTKKRVSSFRGNKTfil KT)VT 

TIJtRH\^TKVRAKIRKPJCVTTKINRHDKIN 

GKRKTARKQLSLSPCSQCLNLVFLLADVW 

FGF1PSIYLWLIILYEGLLGGAAYVNTFHN 

IALETSDEHREFAMAATCISDTLGISLSGLL 

ALPLHDFLCQLS 


2692 


B 


1 


678 


MKTLLARASRFLALPRTSFNALSKSHNLLG 
FKDIRSNVEALAQKTQPSVFPKESVQVTPV 
CYTKGDRESVQKCPLIFRSHSATEQVSIRR 
GVTVRVAKWRGESHfflGGPDVPGLVLDTS 

YETSPLIHTPALRVYYIGEDIAMEQVTNLA 
FPLLYSNSHRVSEPGELGFWGPGESVMPA 
nAVSVPPTnrpnQvrsvoPT vr pmnv^nT 

ut\ v o v l v/Xxtvjo i \j v yrju v i_jvlv^vjt x o vj i 
GRWISASAMSCnSDRNG 


2693 


A 


22 


334 


ALKHFCLCSLIFSVTTMKFLAVLVLLGVSIF 

LVSAQNP\TTAAPADTVSSLLVLLMMKPLD 

AETTAAATTATTAAPTTATTAASTTARKDI 
pvt PYWvnnT ptsjp 


2694 


A 


3 


435 


RVDPRVRAPRCGDKIKNHMY\KCDCGSLK 

DCASDRCCETSCTLSLGSVCNTGLCCHKC 

KYAAPGWCRDLGGICDLPEYCDGKKEEC 

PNDIYIQDGIPCSAVSVCIRGNCSDRDMQC 

QALFGYQVKDGSPACYRKLNRIGNRFGT 


2695 


A 


120 


1438 


TMNSEDTLRQNLLMGYRQHQAILTAHSTG 

PRRPAHQSSAEGSLVPCSGN\PVPPKG*LW 

ARQGPAEVSGAGKIPASPKTGFPFLFLSSH 

WKLEKGYSPCAQAGCSKGQGLSPQPYLKV 

LIILGYQA*KGS*FFGPSPPSRKVFPSMGTG 

PQRRKFS*PRFPEGLN*PDCGPGTEPPLGCG 

CRGLS*VPRSGREKRAMADP*SQLGGSQL 

GGDFS/*GPEAGRL*VGAQQGPPGVRNRH* 

SPLLTSS *R/PKARSPDESRGKPQSPLPMMS 

T T P/PrrTrPQrrP'HT fTPPT T7HT PP APQTDT AMD 

GPQSMVXGPHSDFYPLPVSPWGSRRLQPTQ 

LCLPDSKLPGASPPGSAKMAAGQVRWNG 

NVAR/PTPPGN*PPSSPPGADPLLSQLDPLRP 

LKWLPSLQFFPKGCGLGCLCPGPPASERSV 
T ^PAPfiVPfrT vhvt npnnvA DTPnni? 


2696 


A 


2 


454 


SGHGSSSGTKSSKKKNQNIGYKLGHRRAL 

FEKRKRLSDYALIFGMFGIVVMVIETELSW 

GAYDKASLYSLALKCLISLSTIILLGLIIVYH 

AREIQLFMVDNGADDWRIAMTYEPJFFICL 

EILVCAIHPIPGNYTFTWTARLAFSYAPST 


2697 


A 


506 


1317 


GRTSSGKAGMWKPGAESWPLHTGAAQV 
MWFEKLYAGLQCVEKYLIYPAWLNALT 
VDAHTWSHPDKYCFYCRALLMTVAGLK 
LLRSAFCCPPQQYLTLAFTVLLFHFDYPRL 

TYIAPWQITWGSAFHAFAQPFAVPHSAML 

FVQALLSGLFSTPLNPLLGSAVFIMSYARPL 

KFWERDY>HXRVDHSNTRLVTQLDRNPG 

ADDNNLNSIFYEHLTRSLQHTLCGDLVLGR 

WGNYGPGDCF 


2698 


A 


86 


820 


MACYLLVAMLLVNLLIAVFNNTFFEVKSIS 

NQVWKFQRYQLIMTFHERPVLPPPLIIFSH 

MTMIFQHLCCRWRKHESDPDERDYGLKLF 

ITDDELKKVHDFEEQCIEEYFREKDDRFNS 

SNDERIRVTSERVENMSMRLEEVNEREHS 

MKASLQTVDIRLAQLEDLIGRMATALERL 

TGLERAESNKIRSRTSSDCTDARLHWPVRA 

ALTSQEREHLSAPKRGLEPWQNILFIQYKP 

AASSST* 


2699 


A 


3 


553 


KASVIVHSDVKPFKCKLCGKEFNRMHNLM 

OllMill^iYaJ^rKCLYCrSKl'llJCGNLTR 

HMKVKHGVMERGLHSQGLGRGRIALAQT 

DGVLRSLEQEEPFDLSQKRRAKVPVFQSD 

GESAQGSHCHEEEEEDNCYEVEPYSPGLAP 

QSQQLCTPEDLSTKSEHAPEVLEEACKEEK 

EDASKGEW 


2700 


B 


123 


719 


MTEEEEWKPMDPSKMRCSFFQNGKESEKE 

V\7TyT n D CT T A f\\ 7TTDT "\ /*" , T"\^~ , C 1 T\ A TT /"\XT A 

KVr 1 KbLLAQ V1LPLVN YRGDGSDATLQNA 

DPFVGKAGLGFVDDSPLKEVRCQRGLMD 

~\r\nrv qmcwtyvcit; a \t~d a t /tt rr TMvn>ccr» 
JN VrLlvo V uJalv I JsJsajiSA VJr AL\sLiAL,Drir olSt 

YQPFLAYPRYVKPSSEIPSILPWKENIELGK 

QATNNSFTEYMLNCAGLDPCHSMCGSRTK 

miTCELARNAESQAPPHTY 


2701 


A 


185 


284 


GQARWLMSVIPALWKAEAGGPLEPRSSRP 

A X\T A T 1 

AW Al 


2702 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWR 
SIRILYLTMFLSSVGFSWMMSIWPYLQKID 
PTADTSFLGWVIASYSLGQMVASPIFGLWS 
N YKrKK±jrL,lV!slLIb VAA1NCLYAYLHIPAS 
HNKYYMLVARGLLGIG 


2703 


A 


502 


822 


DSKAAQDLEKLHGVNGMSVDEKPDSPVMY 
VYESTVHCTNILLGLNDQRXKDILCDVTLI 

\/T7D VC17D A LTD A WT A A POCVTU7r> A 1 X/PAT 

VJbKJ^rKArlKA V UAACot I r WCjAJL VUvJI 
KNDLWSLPEEVQ*FGLCDC 


2704 


A 


313 


638 


RWRQRWF WCLHCL VLFRITPRTFALSQ CR 
PWDDSRSQDTSMSHSIQWNRMYCNCSMQ 
DEQEADEANGKGPAQVGDRQAWAGR/CR 
oriKKJbu 1 IrvjNFJdrRAiS^RAGWQR 


2705 


C 


431 


838 


MLLHVGTTAHVAVEHLIGGVQDDEDLEM 
1 lUCnUJ3JDJYuulyLJJJ&IN or GAGOLCICj JiKVG 
GPGCCEVURMTPTEDVGEERSDMKGIQLS 
MQERTRCRQFPEGRRHQLGHLLQGGLGRG 
EAWKYHQIWEEGHWLLREQ 


2706 


A 


244 


375 


RGMGRTYRGRHTDSRKSDR**GGRRQKTQ 
KPMSOTVQRKHGTS 


2707 


A 


1606 


228 


GTSGVQQEISRLTNENLDLKELVEKLEKNE 

RIOJCKQLKIYMKKAQDLEAAQALAQSER 

KRHELNRQVTVQRKEKDFQGMLEYHKED 

EALLIPJvILVTDLKPQMLSGTVPCLPAYILY 

MCIRHAVDYTNDDLKVHSLLTSTINGIKKV 

LKKHNDDFEMTSFWLSNTC\RLLHCLKQY 

SGDEGFMTQNTAKQNXEHCLKNFDLTEYR 

QVvL\SDLSIQrYQQLIKIAEGVLQPMIVSAM 

YmJEAmQMNAFHTVMCDQGLDPEIILQV 

FKQLFYMINAVTLNDLLLRKDVCSWSTGM 

QLRYNISQLEEWLRGRNLHQSGAVQTMEP 

UQAAQLLQLKKKTQEDAEAICSLCTSLST 

QQIYKiXNLYTPLNEFEERVTVAFIRTIQAQ 

LQERNDPQQLLLDAKHMFPVLFPFNPSSLT 
MDSIHIPACLNLEFLNEV 


2708


B 


1 


468 


MQGLVNYQISIKCSNQFKLEVCLLNAENK 

WDNQAGTQGQLKVLGANLWWPYLMHE 

HPAYLYSWEVRLTAQKSLGPLTSTHSLWG 

SALCPSPRASGNmAHTKAII)PSQPVTFVT 

NVTYAADKGPLWEVAAPSSSQRASSGVTE 

LTRVTPVDLQIE 


2709 


A 


419 


2 


TSNPKNKVGLLDLELNRLTKALFMALVAH 

SIVMVTIQGFVGPWYRNLFRFLPLFSYIITIS 

LRVNLDMGKAWGWMMMKDENIPGTVV 

RTSTIPEELGRWYLLTDKTGPLTQNEMIF 

KRLHLGTVSYGADTMYEIHTK 


2710 


A 


1 


570 


MSAACGQNYTLALMEMGSVFAFGENKMG 
QLGLGNLTDTIPSPAQnYNGQPITKMAFGA 
EFSMIMDCKGNLYSFGCHEYGQLGHNSDG 
Kr 1ARARRTDGYGRLGHAEQDEMVPHLVK 
LFDFPGHRVSQIYTGYTCSFAISEVGGLFFQ 
GATNTSRESTTYPKAVQDLCGWnQSLACG 

VCOTT\7 A TTCO A T> 

KobJi V A 1 xiKAJr 


2711 


A 


574 


737 


AWEGAHVFTTSPSSCHSWVRDYARVGLPP 
LPLPCPQRALLGLWEVWKGAYSPAI 


2712 


A 


175 


2 


MALRHLALLAGLLVGVASKSMENTDTDV 
PAPEVLTRSTAGVRGACASQRGALRCLLG 
P 


2713 


B 


85 


591 


MERGPVTCTQAQTVRGRTGHRRRFGPGA 

HGLREEPEFVTARAGESWLRCDVIHPVTG 

QPPPYWEWFKFGVPIPIFIKFGYYPPHVDP 

EYAEQSCFQAPSFPSPSPAEELRWSARHG 

LCQALDASWFCTGVQRQPWTQPPTGYHL 

AQRAGDLYPVGFPKETYFEKV


2714 


A 


1196 


1459 


KQCQRRCLETEVWKLSKLQISTKASNRQD 
RSTFSAPPRKSQLMW*TSLLSYFQKLPQSP 
QPSATTALISQQPSTLNPQPWPGSCPGG 


2715 


B 


1 


888 


MRIRRWSLMFDSVWPMCAFYSWAKASRT 

FLKAD GLPRRKQ WVLVE ALAGGG VLGVK 

QITIQVLFEVLLRRGKESETYTKMYRRLGP 

ERCRRSKYAGVERIVDKRKNKKGKWEYLI 

RWKGYGSTEDTWEPEHHLLHCEEFIDEFN 

GUiMSKDKRIKSGKQSSTSKLLRDSRGPSV 

EKLSHRPSDPGKSKGTSHKRKRINPPLAKP 

KKGYSGKPSSGGDRATKTVSYRTTPSGLQI 

MPLKKSQNGMENGDAGSEKDERHFGNGS 

HQPGLDLNDHVGEQDMGECDVNHATLAE 

NGLGL 


2716 


A 


94 




KWAVDAELNVFYEESVHFDRDLPEFGHV 
IJ)VHGVHVHKDGLTVTSPVLMWVQALDII 
LEKMKASGFEFSQVLALSGAGQQHGSIYW 
KAGAQQALTSLSPDLRLHQQLQDCFSISDC 
PVWMDSSTTAQCRQLEAAVGGAQALSCL 

TGSRAYEFNLVCDRKHLKDTTQSVFMAGL 

LVGTLMFGPLCDRIGRKATILAQLLLFTLIG 

LATAFVPSFELYMALRFA\GLLPSLDLASA 

MSPY*QNGWGPHGGRRPWSWPSATSPSGR 

WCLRDSPTVS ATGGSFRSP ALRLAYCS S\LL 

LGSARICTLAPDPWEDGRGDTTDPENGLG 

Q*AETLPGAHEPAGPREDRPLRECPGSVQT 

PPAPEGDPDYLLCLVCGQSGVLRPEPPSGG 

LRPGRLSDAAHLWSC*GACPLFQHLHDAE 

VWPQVEP/RWGPWSWVA*CVSSSSSSQQIC 

PWWSPCWLWWGKWPQLLPLPSPMCTLPS 

FSPPSSGRQAWGWWASSHGSGASSHHL*S 

CWESTTLPSPCSSTAASPSWPA/SLCTLLPE 

THGQGLKDTLQDLELGPHPRSPKSVPSEKE 

TEAKGRTSSPGVAFVSLGTSDTLFLWLQEP 

MPALEGHIFCNPVDSQHYMALLCFKNGSL 

MREKIRNESVSRSWSDFSKALQSTEMGNG 

GNLGFWDVMErTPEnGRHRFNTENHKYF 

KGKGAPGHPMPSLKANFDLLACLRGVGSS 

TLLLWPAVLGAQTRQAGVNEGRSQVADF 

LRIPVTGCPEQRRNPPSPPAPLGTGGPAEER 

LQFPGVAGSRRGRGRILRAGGIGRASPGEG 

TGAPRPRAGQGRGGPGKPESGGGGPVALR 

PGDCTCCVLKSQPRQQRRGACSAMAFRVR 

LRVRQSVRPPRGVTVAALQRPETQGPAPSS 

ARPDCGPESRGGLALWRRLRGYASRDRVL 

O^PJICPHAARFPSKRTPSGSPHLHLMSSW 

AVP 


2717 


A 


1308 


369 


LRSNHGEDWSQFIGAAQRETTVSLLPMPH 

TWPVSLSTGSCM/TRGTPILPFINNPQLQVH 

FHR/EDDEHSDIAFHF*VYFGHWVIMNSHE 

C/GAWKCEERSNNMPAEDGRVFELHIIVLD 

NEYQAMVNG/QSLLHSFAHRLLPGSVKMV 

QVWRDVSLNSRCVSSGETVSSSSSFLPPPPP 

PLPLPLLLLLPPLPLPDEALFLSLPSHALPSG 

RCGVLSLCGSHYPQPGGLLQSSAGASGRR 

GAPGVPWQVLVLLTPRGLQGPPPGMRGRV 

VHKPLLVMELGEQPFSFPSVRTATSSASGK 

APPRCPWPGPRALSPSSVP 


2718 


A 


2 


1226 


SLGSTISTDWANHYLAKSGHKRLIRDLQQ 

DVTDGVLLAQHQVVANEKIEDINGCPKNR 

SQMffiNmACLNFLAAKGINIQGLSAEEIKN 

GNLKAILGLFFSLSRYKQQQQQPQKQHLSS 

PLPPAVSQVAGAPSQCQAGTPQQAPGVPV 

TPQAPCQPHQPAPHQQSKAQAEMQSRLPG 

PTARVSAAGSEAKTK.UVJO 1 lAlNlNKKiV^ar 

NNYDKSKPVTSPPPPPSSHEKEPLASSASSH 

PGMSDNAPASLESGSSSTPTNCSTYSGffHS 

GAATKPWRSKSLSVKHSATVSMLSVKPPG 

PEAPRPTPEAMKPAPNNQKSMLEKLKLFN 

SKGGSKAGEGPGSRDTSCERLETLPSFEESE 

ELEAASRMLTTVGPASSSPBCIALKGIAQRTF 
SRALTNKKSSLKGNEKGKE 


2719 


A 


103 


742 


NANTQRARRREGARLDNLWLEQVISVLPG 

LVTQGFRCHSGPMGRGLEPHPIRGAGAGS 

CQLSIRGRGGRIPAFLTPRRLAPKGGRDLG 

FPAPRGTRCLRHSFCRSIARTVT/RTVRGIR 

GEEARTPGSREMDSWFEDVDVNFTQEEW 

ALLDPSQKNLYRDVMQETFRNLASVGKK 

WKDQKIEDEYK^RRNLRNYVYHFSLKK 

WSWSLYARQT 


2720 


A 


1258 


586 


LLLHSLFPVPRMGNSASNTVSPQEALPGRK 

EQTPVAAKHHVNGNRTVEPFPEGTQMAVF 

GMGCFW GAERKFWVLKGVYSTQVGFAG 

GYTSNPTYKEVCSEKTGHAEWRWYQPE 

HMSFEELLKVFWENHDPTQGMRQGNDHG 

TQYRSAIYPTSAKQMEAALSSKENYQKVL 

SEHGFGPITTDIREGQTFYYAEDYHQQYLS 

KNPNGYCGLGGTGVSCPVGDCK 


2721 


A 


2806 


382 


NEIEKQLNAIRDNKIGEDRAARLDRKMEE 

QQVRLNEAEQKYKDIQDKLEKISEETNAR 

APECMALKADVVAKKRAYNEAEVLYNRS 

LNEYKALKKDDEQLCKRIEELKKSTDQSLE 

PERLERQKKISWLKERVKAFQNQENSVNQ 

EIEQFQQAIEKDKEEHGKIKREELDVKHAL 

SYNQGQLKELKDSKTDRLKRFGPNVPALL 

EATODAYRQGHFTYKPVGPLGACIHLRDPE 

LALAIESCLKGLLQAYCCHNHADERVLQA 

LMKRFYLPWTSRPPITVSECRNEIYDVRHR 

AAYHPDFPTVLTALEIDNAVAANSLIDMR 

GIETVLLIKNNSVARAVMQSQKPPKNCRE 

AFTADGDQVFAGRYYSSENTRPKFLSRDV 

DSEISDLENEVENKTAQILNLQQHLSALEK 

DKHNEELUCRCQLHYKELKMKIRKNISEI 

RELENIEEHQSVDIATLEDEAQENKSKMK 

MVEEHMEQQKENMEHLKSLKIEAENKYD 

ADCFKTNQLSELADPLKDELNLADSEVDNQ 

KRGKRHYEEKQKEHLDTLNKKKRELDMK 

EKELEEKMSQARQICPERIEVEKSASILDKE 

INRLRQKIQAEHASHGDREEIMRQYQEARE 

TYLDLDSKVRTLBCKFIKLLGEIMEHRFKTY 

QQFRRCLTLRCKLYFDNLLSQRAYCGKMN 

FDHKNETLSISVQPGEGNKAAFNDMRALS 

GGERSFSTVCFILSLWSIAESPFRCLDEFDV 

YMDMVNRRIAMDLILKMADSQRFRQFILL 

TPQSMSSLPSSKLIRILRMSDPERGQTTLPF 

o d\ rvrvcx: t\t\t\ rvo 
Kr V 1 ^ tcUUlJ^K 


2722 


A 


1567 


1145 


AEVLGRAVEPPPGRCWSTPPVAPPARSASA 
AAMGVQVEHSPGDGRTFPKRGQTCWHY 
TGMLEDGKKFDSSRDRNKPFKFMLGKQEV 
IRGWEEGVAQMSVGQRAKLTISPDYAYGA 
TGHPGUPPHATLVFDVELLKLE 


2723 


A 


374 


656 


RRVGCRCFHPSQTGTCT*RPPWNVHH*PAT 
CHLAYNRHSWSPHRA/HWHIATAIQLSAH 
VF/ACHYQQLHHYHQHHHHHHHYRHHHH 
HHHHHYCHHH 


2724 


A 

A 


11/1 




PMALW ADGRARHKVGTECECGMHPGLKC 

SGRTLGSQTMLATTPCDSPT*I/SNKNGLRS 

V/SYR*CLINALWLFSISPHILVRCGTESS*L 

LPSLVPSWLP*LVRVR\PLPTGWC*IPSCLKP 

\PPTWSSHHSPQRLP*NPATLVCLQNGTARS 

HSSTPV 


2725 


A 


Q 
O 




G<?FKTGLYLPTSDIDLWFGKWENLPLWTL 

EEAIJIKHKVADEDSVKVLDBCATVPIIKLTD 

SFTEVKVDISFNVQNGVRAADLIKDFTKKY 

PVLPYLVLVLKQFLLQRDLNEVFTGGIGSY 

SLFLMAVSFLQIJIPREDAOTNTNYGVLLI 

EFFELYGRHFNYLKTG 


2726 


A 


214 


32 


MTLRMLVPRLLLTRQLVWFFSAATERDPE 
MMNGffRKLMSFPPSSVTSRRSRRGHHLQS 
L* 


2727


A 

A 


z 




WNSDOPATR*OVGDTGSLPSRKGQHFVLT 

GIDTYSRSGFAFPVRHAPAKTSIRGLTECRT 

YCHGMPHCTAS V*GTPFTAKKVW *RAHA 

HGDPRYDHVAHHLEAAGLIRWWNGLLKTP 

LQHQLGGDALQGWARVLQEAVYALNQN* 

V*GW 


2728 


A 


16 


444 


TPSPSPCPXPRPLAALKPVRLHSFQEHVFKR 

ASPCELCHQLIVGNSKQGLRCKMCKVSVH 

LWCSEEISHQQCPGKTSTSFRRNFSSPLLVH 

EPPPVCATSKESPPTGDSGKVDPVYETLRY 

GTSLALMNRSSFSSTSESPTRS 


2729 


A 


37 


655 


AEPAAGAGTLAGDCRAVQGGVHAARPRG 

AKEGHGPADGHGKGGAGTGQERLAGGAE 

vrHAOVRGGAAAPGCRVGGVLRAAKAE* 

GAGRARGRAGIAGGHPAGGHPHQPGQGA 

G*AEDQGQRAPGRGEAAGSGR/GA/GPGA 

GAAGAAAGEGEDQRHRPACQAPRRGGGE 

HEQGGLREVRGGGAGIARGPAGAGRAAG 

PVAGGAATAGAA 


2730 


C 


257 


498 


MQKSEGSGGTQLKNRATGNYDQRTSSSTQ 

LKHIWAVQGSKSSI^TSSPESARKLHPRPS 

DKLNPXTINPVHSDDEVFERG 


2731 


A 


342 


665 


MALDFVNVLLCQLAEVTLGVLREEGASLL 
VALGSALFPSAAAVGKQGSMGVTSHMQC 
PVCQHPRDVLLASPVSHSHACQPQPAGCS 
NCHLGHLTRSPPFQGLLPLLQ* 


2732 


A 


1 


825 


MKRYSYGSVLFTAFDLGYLDPDEVQQGHE 

IGRLFDGTEPIVLDSLKQHYFIDRDGQMFR 

mNFUlTSKLUPDDFKRTLVFILPLAAPFS 

VGLEACPLAGKRIJCGSVCPELEFPLWKKH 

RWSQSLPYXTHAFNEERLQDNKSYIHSVL 

OEPREDTDPEGAGAAPDHRSTYKLLSPALS 
imGEKNKWLRRYIELLISEREMAAAGSSI 
PSWTSVSIQVKLRKCQLQLLAKEEVATIVL 
DETSGVNGIfflEHQLQCLIQVPKLSAPNIAP 
PTPA 


2733 


A 


135 


438 


GMGYLHAKGILHKDLKSKNVFYDNGKVV 
ITDFGLFSISGVLQAGRREDKLRIQNGWLC 
HLAPEHRQLSPDTEEDKLPFSKHSDVFALG 
TIWYELHAREWP 


2734 


A 


74 


661 


HTHKLVAPRPGLPPTSQWPRDAGRQASGG 

LPSLSTGPPKGPRDGLARGHPAEWLAGSPG 

NNSPTQGSLPPQLDLYAGALFVfflCLGWNF 

YLSTELTLGITALYTIAGMVPAAGRSTQGT 

CKGVRRPPPPTGPREQPRKWPQQEPQKFLP 

VSLLPGARAPSSNLASTGRGPGCCNLHGRP 

ADAHHGGGGCHPDNQR 


2735 


A 


40 


446 


RHLLLSLSAVTGKCSFAPDCGELKLPGAAC 

ACQWADVSSLLL*LCQMRELRCENVATC 

LGIFAGSLGNLLRKEVLHLDWTFKASLLLD 

LICMRSLPGPGTAELLWTAPELLPGPGRPG 

RRTLTGDIFSTGIILQE 


2736


A 


i 
i 


517 


LVDPRVRGEPGPPSDAVFARDPMRPPGLV 

Rl^QVTDRSNTSITLSWAGPDTQEGDEAQ 

GYWELCSSNSLQWLPCHVGTVPVTTYTA 

KGLRPGEGYFVRVTAVNEGGQSQPSALDT 

LVQAMPVTVCPKFLVDSSTKDLLTVKVGD 

TVRVPVSFEHARRPLGPSTCRRTCLGR 


2737 


A 


3 


437 


NDPRVQKPREEAPAGAAASG*CGR*PGQH 

PAAA*\P*SAGPRRAPTALSPPTAEPSLCPA\ 

PG*PEQPQCSRRPGGQPRDPVGQHRSQPAV 

GPAAGSPLRPCAWSAQRGSPQPDQLPHTPP 

GAAGS*SQLPRPPPSFAQATPSTPP 


2738 


A 


34 


576 


EELCVREHVTGGICGGSQMMWLLGATTL 

VLVAVAPWVLSAAAGERRGGESWRRAGG 

RARSWATGAAMLLGATDAQSGKPSVHFA 

APKIKPDLGSQINQEKVVFWVLSCRLPVAV 

YGSSGAPGSHPREMAVPELCVEFDSFRETH 

QILLVYFVCGPRQLFFQCGPRKPKRVDTLD 

ADEACR 


2739 


A 


2 


410 


CHSTESSSDFILPGDYLLGGLCPLHSGCLQV 
\CSFNEHGYHLFQAMRLAVEEINNSTALLP 
NTTLGYQLYD VCSDS ANVY ATLRVLSLPG 
QHHIELQGDLLHYSPTVLAVIGPDSTNRAA 
TTAALLSPFLVPMLLEQ


2740 


A 


2 


417 


STRPEFPGRAPTGFLKLLADKNSELFRKYA 
IJSPSDHRVPRIYVPLKDCPQDFVARPKDY 
ANTLFICRIVDWKEDCNFALGQLAKSLGQ 
AGEIEPETEGILTEYGVDFSDFSSEVLECLP 
QGLPWTIPPEEFSKRRW 


2741 


A 


1 


312 


MAPAADREGYWGPTTSTLDWCEENYSVT 
WYIAEFNSWLMSGFLPTPSSLRDLTASRWV 
RSLPPSRSPAGRQPGPAEELPKASPCPWGK 
OT OTIHT? A OTTO A CO/^TJC 1 

SLSRPFASFSASSGPS 


2742 


A 


2 


374 


FRDLQCALYNGRPVLGTQKTYQWVPFHG 

APNQCDLNCLAEGHAFYHSFGRVLDGTAC 

SPGAQGVCVAGRCLSAGCDGLLGSGALED 

RCGRCGGANDSCLFVQRVFRDAGAl'AuY 

WNVTLIPEGA 


2743 


B 


218 


656 


MGPVPLVWAMSQLSLSAKMDRRRTGVM 
MTSTPITWGTLEKTMQEAEKLLERQGQTK 
TPDSMFLAMEESLNVlFVKNri iQFMVCu 
FNPYWLAAKADQLQVWSHTTTASQER 


2744 


A 


85 


396 


MILINFREICLKVLHTPLCVSGGCVLLYILA 
LTCCYTNSLLISHLPPLSLPTETQTHLFMYR 
VLKVRKDIKNHVFHPTYLVAKETETYGEE 
LIPLPPCREHQD* 


2745 


A 


1 


3899 


NRPSSASSTSSKAPPSSRRNVGMGTTRRLG 

SSTLGSKSSAAKEGAGAVDEEDFIKAFDDV 

PWQIYSSRDLEESINKIREILSDDKHDWEQ 

RVNALKKIRSLLLAGAAEYDNFFQHLRLL 

DGAFKLSAKDLRSQWREA\CrrLGHLSSV 

LGNKFDHGAEAIMPTIFNLIPNS\AKIMATS 

GWAVRLnRHTHIPRLIPVITSNCTSKAVA 

VRRRCFEFLDLLLQEWQTHSLERHISVLAE 

TDCKGIHDADSEARIEARKCYWGFHSHFSR 

EAEHLYHTLESSYQKALQSHLKNSDSIVSL 

PQSDRSSSSSQESLNRPLSAKRSPTGSTTSR 

ASTVSTKSVSTTGSLQRSRSDEDVNAAASA 

KSKVSSSSGTTPFSSAAALPPGSYASLDGTT 

TKAEGRIRTRRQSSGSATNVASTPDNRGRS 

RAKWSQSQRSRSANPAGAGSRSSSPGKLL 

GSGYGGLTGGSSRGPPVTPSSEKRSKIPRSQ 

GCSRETSPNRIGLARSSRIPRPSMSQGCSRD 

TSRESSRDTSPARGFPPLDRFGLGQPGRIPG 

SVNAMRVLSTSTDLEAAVADALKKPVRRR 

YEPYGMYSDDDANSDASSVCSERSYGSRN 

GGIPHYLRQTEDVAEVLNHCASSNWSERK 

EGLLGLQNLLKSQRTLSRVELKRLCEIFTR 

MFADPHSKRVFSMFLETLVDFmHKDDLQ 

DWLFVLLTQVLLBCKNGEADLLGSVQAKVQ 

KAU5VTRDSFPFDQQFMLMRFIVDQTQTP 

NLKVKVAELKYffiSLARQMDPTDFVNSSET 

RIAVSRnTWTTEPKSSDVRKAAQIVLISLF 

ELNTPEFTMLLGALPKTFQDGATKLLHNH 

LKNSSNTSVGSPSNTIGRTPSRHTSSRTSPL 

TSPTNCSHGGLSPSRLWGWSADGLAKHPP 

PFSQPNSIPTAPSHKALRRSYSPSMLDYDTE 

xtt XTCT7T3TVCCT DnVTT? ATRlc'17QT7'D QOT7FIT XTR 

PDCRDGKKECDIVSRDGGAASPATEGRGGS 
EVEGGRTALDNKTSLLNTQPPRAFPGPRAR 
DYNPYPYSDAimTDKTAIJCEAWDDDME 
QLRDVPIDHSDLVADLLKELSNHNERVEER 
KGALLELLKTTREDSLGVWEEHFKTILLLL 
LETLGDKDHSERALALRVLREILRNQPARF 

KNYAELTMKTLEAHKDSHKEWRAAEEA 

ASTLASSIHPEQCDCVLCPnQTADYPINLAA 

DCMQTKWERIAKESLLQLLVDHPGLLQGY 

DNTESSVRKASVFCLVAIYSVIGEDLKPHL 

AQLTGSKMKLLNLYIKRAQTTNSNSSSSSD 

VSTHS 


2746 


A 


153 


1224 


RVFSESVCSPYRNLEFLWRFAFPLAPAGRC 

PPGVPLQTSPRDTDAHRSSPLPPARASPGQ 

VAAAYRWARCPGCGGRKPRSSGSWQLCR 

CPTLPPPPRGSRSSGRC7RTWPSPPSCFPHFQ 

SGPRTTRAPTPSTI\PGYSGSYSSGPGR*GLS 

PLHAAAVSPPLPPGGP*GSWARAGLGSIASA 

HSPCPLCRSLIRSRS*QTCTRSPT*NCEVPPS 

AP * AASPLRTMFALVRTAGLK VHLLPLG Y 

CTmS*SSSMPQTVPVVVKVSNIPSVHPP*P 

CC^CTISRSRSIFTRSPICNPPGFLLPFCSPS 

TGQ*SL*KEPPLASWTHFRSDVLLLFSVSM 

NGSTLSLGCPSQKAVIALVQVT 


2747 


A 


1 


996 


MKIHSCAFVIEQEEKKKTEAHKEGDGVKR 

ADKILGVTKDPGTIAGLNVVRIINEPTAASI 

AYGTDKKFGAERHVLIYDLRDEIFDVSVLT 

LEDEIFEIKSTAGDTHLGEEDFDNQMINHFI 

AEFKYKHKDSRADIYTSITHAQFEELNAVL 

FRGTQDPIEIALQDTKLDKLQIHVIVLTQTF 

TTYPDNQPDVLIQVYEGESAITKDNNLLVI 

QGKFELTGILPAPFAVPQIKVTCDIDVNSSL 

MSAVGKSTEKENKinTNDQGHLSKEDIEN 

MVQEAEYKAEDEKQKNKVASKNSLDSYA 

FhMKATEKXQGKINNKDKOKILDKCNKIIN 


2748 


A 


73 


1210 


IPPPSSPSSPAAAPRAQLGKDALSPLALLLR 

PRRAYPRPLPTSESLAWGSPPPSRFGPSPAS 

QPRSPRLSFLVLGVACSAILMYIFCTDCWLI 

AVLYFTWLVFDWNTPKKGGRRSQWVRN 

WAVmYFRDYrTIQLVKTHNLLTTRNYIF 

GYHPHGIMGLGAFCNFSTEATEVSKKFPGI 

RPYLATLAGNFRMPVLREYLMSGGICPVS 

RDTIDYLLSKNGSGNAmVVGGAAESLSSM 

PGKNAVTLRNRKGFVKLALRHGADLVPIY 

SFGENEVYKQVTFEEGSWGRWVQKKFQK 

YIGFAPCIFHGRGLFSSDTWGLVPYSKPrTT 

WGEPITIPKLEHPTQQDIDLYHTMYMEAL 

VKLFDKHKTKFGLPETEVLEVN 


2749 


A 


351 


205 


DLYSEKASADHEGAEQFTDEFAKVIADGN 
LMPEQVYNAVKTSLFWCMVP 


2750 


A 


172 


2 


MLEQASLWLGRSFLLAGFLVSSSCPSLEQA 
AKGEGCSPIPCFAHCLDSLVRNFLCHP 


2751 


A 


2 


1410 


GPLIDLCKGPHETHTGKIKTIQIFTNSSTYW 
EGNPEMETLQRIYGISFPDNKMMRDWEKF 
QEEAKbniDHRKIGKEQELFFFHDLSPGSCF 
FLPRGAFIYNTLTDFIREEYHKRDFTEVLSP 
NMYNSKLWEASGHWQHYSENMFTFEIEK 

DTFALKPMNCPGHCLMFAHRPRSWREMPI 

RFADFGVLHKNELSGTLSGLTRVRRFQQD 

DAHffCTVEQIEEEDCGCLQFLQSVYSTFGF 

SFQLNLSTRPENFLGEIEMWNEAEKQLQNS 

LMDFGEPWKMNPGDGAFYGPKIDIKIKDAI 

GRYHQCATIQLDFQLPIRFNLTYVSKDGDD 

KKRPVIIHRAILGSVERMLAILSENYGGKWP 

FWLSPRQVMVIPVGPTCEKYALQVSSEFFE 

EGFMADVDLDHSCTLNKKIRNAQLAQYNF 

ELWGEKEKIDNAVNVRTRDNKJHGEILVT 

SAIDKLKNLRKTRTLNAEEAF 


2752 


A 


319 


495 


MVASFRESRVLLLGLWRVLTFDFLTQW 

RVGSECGDELVRLYSFTDEKANYLQQGGC 

R 


2753 


A 


23 


1255 


LRSIYTTHYRESVPKA/HLTDSFPDLLGLAA 

ED*HCPIALEAL*TITDAELRVTLTVEGKPV 

PFLINTEATHSTLPSFQGPVSLASITWGIDG 

\QA\SKPLKTPQ\LWCQH*TERRFKHSFLVIP\ 

TCQVPLLGVEDTLTKLSASLTIPGLQLYLIAT 

LLPNPJCPPLCPPLV/SPQLNPQV*DISTPSLT 

TDS 


2754 


A 


277 


467 


GLGPHDYLYSILSIERSCCC*CCCCCCRRRR 

CCCCaCV*GCSRFLCSIAESTPSGALRRLR 

GGR 


2755


A 


oo 


593 


ASALLFWGFAESLREFTADCPPYKCPVAP 

EPLPQPLSVPLQCPGEESTDSPFSLPTVQPVK 

SRCSPFIEESPRANRSIPAFGSHLECASCSSR 

SFHGPPPCCLWGLPLSAPSPHVLHPPASAAI 

GPACCVTSLCPGAPQAQRPRKVDQTSSAP 

GAGPGTQDGNERPNP 


2756 


A 


3 


3617 


YWKERPTQKVIPRATENHGLKSYLQKTKL 

SIDEAAFLLPDTNLKSELLELLTHWLQVGV 

PMTPSLGSI^LGWLTELRETHTYICWFIV 

KETTRDTDEEMCRTEPALACSISHYCDDGC 

IQMLNTPETLQCSAKDSKHFIPKECSIPGEN 

PJPPSDTGKTVKFLSLNIFNLQLAESTDAEQ 

RANCILRCFLTETTLNYQKILSVRPGTKLAT 

ASHVSGLGLQTPPFGLAQHLIRPHAFLAPK 

DPLTSFTERNSRSGKTRCRSKKCAMRVVK 

SYSAILPKKRESVLTKTLLVAPTNEQTDPV 

LRMCCGKTGLKKGAGFTLESRGQRRMRA 

GCPTLCVRARVTbTDPSICSEVTFSWMILM 

LMDVCQCLGIEEFGIYCSLRSLDLFVPIFLE 

KVFQVFEGTSSPIMLWFLQTHRGTTLVALD 

KIQKNSLDYQAETLVLFPYFLPNKWNLSVF 

AEPPGTGDWMQAPLWPPPLGLYWALEH 

YDQHVAKPARQRSLSLWPPPPTAHKGFLQ 

GHCQCSLKTQRLFSQLMANAARPETQASG 

QWTPFSPGQIQKCSPRSRNALGTPRACLLL 

YPTVAELGSTEFNVKPSICCTLPYQGAQSPS 
LHTLQLRGNGVGGQHQQFKTVSLDPFNAS 

FRDMKLKLGKSGISSWFVSIAAAVGDEGL 

VPRSMELYSQKAYDCLCCVMQWRKVGE 

SWQSQTSPSSHTTQKANLTSTLPPTTALSV 

FPGSGYQEWGTAVKILESMEATLEQDNKT 

RLEQFGGFRRKEDRKMWESLELPRDLWN 

DFDQNADSDMDNEVQAEWSDGDKELVR 

NWSKVWKGNVGLEPRYRVPTGALTSRW 

RRGPPSFRPQKCRSTDSLHHEPGKAAGTQC 

QPVKDLPKAVGAHSLHQPALDFRQEYLNP 

FSKNAKFQYECGNYSGAAENFYFFKGLVP 

ATDRNALSSLWGKLASEILMQNWDAAME 

DLTRLKETIDNNDEKPSFTHWGKERYLN 

AIQTMCPQFFRY/L*LTAVHNKQGIVRKRR 

PRV*KI*LSFIKQE\SYTYKRPNLQNLLECL\ 

YVNFDFDGGSRKS*GECEPGLV\NDFFLGG 

*S*GFQ*KMPPJLFIFETF\CRIPPSVSAIN\ML 

AD\KLNMTPEEAERVDW*NLIRKWQAWM 

PQDLIPKLGSCGLWGNNAV\SPLQQVIEKT\ 

KSLSFRSPDVGP*IMRKNLNQNSRSE\AP*R 

GQLQDSGLLIJO^HKEKMKKKNYQRKMK 


2757 


A 


1 


3090 


MHKELPALAACGLVADFDPVGEEETADFG 

PLVLDSDSDDSVDRDIEEAIQEYLKVGSSK 

DQGSASPVSMSRADSFEQSIRAEIEQFLNEK 

RQHETQKCDGSVEKKPDTHENSAKSLSKS 

HQEPATKWHRQGLMGVQKEFAFCRPPVR 

LAK1WQPRSLRSKVTTTTTQEKEGSTKPA 

TP/TRPSEAVQNKSGIKRSASTARRGKRVTS 

AVQAPEASDSSSDDGIEEAIQLYQVQKTHK 

EADGDPPQRVQLQEERAPAPPAHSTSSATK 

SALPETHRKTPSKKKPVPTKTTDPGPGDLD 

ADHSPKIPKETKAPPPTSPASRSKFVEWSSC 

QADTSAELI\AVLDIFKTILP/APMEGSDGSL 

SASPLFYSPNVPSRSDGDSSSVDSDDSIEQEI 

WTFLALKGTASEAPGGEGAARVPGDTRTS 

QGQGKTDEARHLDKKKSSEDKSSSLDSDK 

DLDTAIKDLL/RRVPGPSSQPWLLV*QQQFS 

GQRR*HRTGD*EVFGGKGQGVGSPRPGPA 

LSLEAHTCWRRRTATTGQAGRCVLCYDSQD 

PKCGDLKKPSKKRVKRKPYSTTKVTSGSTF 

NENIRRYAVHTNQCRRPHGSRVKKKRYP 

QEDDFHHTVFSNLERLDKLQPTLEASEESL 

VHKDRGDGERPVNVRWQVAPLRLESSKY 

TGITCQENNIJDAKKAPHEDTVHDrrNEDA 

THDIANEDTVHDIANEAADKGIANEDAAH 

GLASED AAHGIASEDAAQGIASEDAAQGIA 

SEDAAQGIAKEDAAQGIANEDAAQGIANE 

GAAQGIAKEDAAQGIAKEGAAHGIANEDA 

AQGIANEDSAHGIASEDAAHGIAIANEDAI 

YDIANDWQGTLTRTLYTTSLMRTPYKAL 

VMRTLYMTSLTRTLYKPSLTRTLYTTSLM 
TAPYKTSPMRALYTl'LIMIPTRHANADTV 

HDIANEDSVYDIANEGAVYDIANDTVQGT 

LTRTLYTTSLMRTPYKASVMRTLYTTSLTR 

TPYKPSLTRTLYTTSLMTAPYKTSPMRALY 

TTLLNCPTRHANVDAVHDIANEDTV 


2758 


A 


1


1026 


MTLGPLTNQRKEHLTNFKSVSTPSSESFEC 

FFSTDSSDLSPSPQAARRQAEPGACFKCWK 

SGHWAEECLQPRIPPKLHPICVGPHWKSDC 

PAHLAATPRAAGTLAQGSLTPSQIFLAEWL 

KTDTARSPQKPPGPSQTLWVTLTVEVAAT 

ALILLEALKITSYAPLTLYSSHNFQNLFSSS 

HLTHILSAPKTLOLYSLFVESSTinVAGPDF 

NPASHIIPDTTPDPHDCISLIHLTFIPFPHISFF 

PVPHPDHTWFIDGSSTRPNRHTPAKAGYAI 

VSSTFHEATALPPSTTSQQAKLIALTQALTL 

AKGLLVNIYTDSKYAIMQYHHAVIWAER 

NFLTT 


2759 


A 


1 


383 


TRKCGQLPRSVSLPSGPQPLPGSVRHPRPV 

LRRPLPRAQGSSSSFRPRPPFAPDTMDKFW 

WHAAWGLCLVPLSLAQIGECPPQPGQQDG 

CGVLSADPAAAPPAESALGDWSQVSCLRS 

ALGSGKQGW 


2760 


A 


1057 


1226 


ARPSRVEAQMLGARRAASWLWAPWFCPN 

EG*NQPGQHSETPSLQKVLKPGMW/HLL 

WSOLLGSLRWEDRLSPGD 


2761 


A 


349 


1 


NQTPFFFFFFGGTETTSTTLCSWGLLILLKY 
PEVA/ESASQRDPEWEAAVWRWLEGPGSA 
QPPSAPAKGQELDPWGQRPVPSPDDHVQ 
WPYTNAVLLEIQRFISVVKRTLTLDTLY 


2762 


C 


199 


531 


MTGIVAKQNSASVPLPARLVRPTVNRKLL 
GAGTGSLPRKEARRERFLDGDQDGDEGPR 
QPSMGIJPHKQVQNRAMAKVVITFAPTNA 
MQLARSPKTLNFMKHGEMESVLE 


2763 


A 


1 


1428 


MVNPTVFFDTEPLGRISFELFADKFPKTAG 

NFHALSTGEKGFGYKGSCFHRTVPGFMCQ 

GGDFTCHDGTGGKSIYREKFDDKNFIRKHT 

VSGELSMANAGPNANSSQFFICAAKTEWLD 

GKHVVFSKVKEGMNIVETMECFGSRNGKT 

KGAGLAGSHSQRWLAASVCGASQPSRLLS 

TACRQQKLQISGRSKGCSRKTSGLEDQGLT 

KDGTNNTQGIKLQLGEEEEHSPRPSSLVPV 

SQLKANGSSSASIACAEDGPARPVPGCQCQ 

NQGHHQNKRPRTSQLCQMPKTHLVVADA 

RPNISRVFFGLPERESALWSFPRDWLVNLL 

NOCDELGIRNOFEVEVLSYGHLPLAYSARC 

FTARSEDRPKDECETCCIKYPNGRNVLSQE 

NQQVFVO^GIQTMSGYVYNLGNELASMQ 

GLVDWRLSPQGTDTFAMLDAFRANENG 

AAPLPLTANSDCNGYWRRLADFECTW AH 

SQGGCHA 


2764 


B 


159 


2657 


MTCGTDGAITFWESLTGHRYIHKPTNPDEP 
PVAEQPKPLYPYRTIGCVFNHQMFLGNCQ 

PSDAVETCVFDLNDESKWKPMSEEAIKSV 

CAPGATTSLPPFPPLCASTIDASVTSNEIEM 

QLRLLVSEHRKYTKIHTCPSPTGGPVEPAD 

TKSQPSVCMDFTSHEYPJSDPFLVEKNLPK 

EKTANTAGHQKEQTGDTLPLRNITGTVRV 

HGFELEVSETKNPPNPGHKTTSISQRPKALV 

SLGPEVRRGTRGEDEKALEKEGGGRRWEC 

GGANELCGRPPAFTRVTVHWGKGNDQTF 

QDLLDTGSELTLIPGDPKRHCGPPVKIGAY 

GGQimGVLAQVQLTVDAVGPWTHPVVFP 

SARMHNWNRHTQQLAESfflGSLTVHLSSD 

PKGCHSEWGPEQEKALQEVQAAVQAALIL 

EPYDPAGPWLEVSLADRDAVWSLWQAPI 

GESQQRPLGFWSKALPSSAAIKRVMHSSIP 

SSNGSGIYMIGI^QVRKAQIVLHDMQPPCE 

NGTASALQPLSRKSLKDSSEGKSSQWAEL 

RAVHLAVHVAWKEKWPDVRLDTDSWAV 

ANGLARWSGTWKEHDRKIGDKEVWGRGT 

RIELSEWSKTVTIFVSHCFYQDYHPSVGSQ 

NALYTNMVFHTALPLTKALTLRLKNCNSG 

LMLTEFTGLTMFPHOGWGKVLOKAVYAL 

■M-JXY 1 1 J X X^X X VJ M-J X ITU X 11WVJ TT \JXV T 1 f », 1 ' 

NQRPIYEWKEESCLHTGVADALRGNWAE 
GHREHKALWLGLWSTWSQHPLRSLKTTR 
HHPGLGVLSEDICEAGGATEELSRASGFAT 
GYGKRKEDTKKHKQHSVSDIM 


2765 


A 


3 


662 


TRIAETILKKKTKVGGTILSDFKMNKARVL 

EIVWYLWSNRCMNQWNRIEDPETDPQTN 

GALAIGHPQTKQIKLTNRPQSLNLNLRPDM 

KMNSKWIVDLNVKCEAIKTF/EKKTRENLH 

HQKHNIXDNIYKXNFKICSAKSAV/SRKK 

K7PTA*EKIFANRLSNIGLISREYKQLLKLSS 

*KTV*LENGGLAWWLTPVIPSLREAKVDEP 

LEARGSRPAYPTW 


2766 


A 


736 


927 


SVAHSSCVSHTHMHTLLGRPvATINCLFRN 

GRGQVQWLTSAVPALRKADVGG*LEPRSS 

RPAWAT 


2767 


A 


194 


3 


MVMLTIAIRLMQFEFRQFFIKVNFRMRGL 
SKMAMLLLCRARPYSYKKEEGWSVLSGY 
FLTAGNF 


2768 


A 


593 


230 


DFYLYPERKKRGQMMTAVSLTTRPQESVA 

FEDVAVYFTTKEWAIMG\PAERALYRDVM 

LENYGGCGPL*CHPTSKPALVFS\LEQGKES 

CFSPATGSSLSRNDWRAGWIGYLELRRYT 

YLS 


2769 


A 


3 


4804 


KRLENIQKTLEVAFSEAVWMQPSWLLDD 

LDL1AGLPAVPEHEHSPDAVQSQRLAHALN 

DMKEFISMGSLVALIATSQSQQSLHPLLVS 

AQGVHDDFQCVQHIQPPNQEQRCEILCNVIK 

NKLDCDINKFTDLDLQHVAKETGGFVARD 

FTVLVDRAfflSRLSRQSISTREKLVLTTLDF 
QKAIJRGFLPAS1JISVNLHKPRDLGWDKIG 

GLHEVRQILMDTIQLPAKYPELFANLPIRQ 

RTGILLYGPPGTGKTLLAGVIARESRMNFIS 

VKGPELLSKYIGASEQAVRDEFIRAQAAKP 

CILFFDEFESIAPRRGHDNTGVTDRWNQL 

LTQLDGVEGLQGVYVLAATSRPDLIDPALL 

RPGRLDKCVYCPPPDQDGSSSSDSDLSLSS 

MVFLNHSSGSDDSAGDGECGLDQSLVSLE 

MSEBLPDESKFNMYRLYFGSSYESELGNGT 

SSDLEDESMNQPGPIKTRLAISQSHLMTAL 

GHTRPSISEDDWKNFAELYESFQNPKRRKN 

QSGTMFRPGQKFFDEITELTYLPSFHHKAA 

PHQAEPGPNSSSASAPPPYNPFITSSPHTQS 

GLQFRSVTSPPPSAQQFPLKEVAGAKGIVK 

TALETAPTLALPVSSQPFSLHTAEVQGCAV 

GILTQGPGPCPVAFLSKQLDLTVLGSPSCL 

HAVASAALILLEALKITNYAQLTLYSSHNF 

QNLFSFSHLTHILSAPRLLQLYSLFVESPTIT 

ILPGPDFNLASHIILDTTPDPDDCMSLIYLTF 

TPFPfflSFFSVPHVDHIWFTDGSSTRPDRHS 

PAKAGYAIESSTSnEATALPPSTTSQQAELI 

ALTRAPTLAKGLHVNIYTDSKYAFHILHHH 

AVTWAERGFLTTQGSSIINASLIKTLLKAAL 

LPKEAGVTHCKGHQKASDPITLGNAYADK 

DRTTOGSSQVffiEKNHNGYSVTOTGTLVEA 

ELEKLPNNWSPQTCELFALSQALKYLQNQ 

KTISILIQKEPSPALGLTPERKGNVGHAGKG 

PLESSSPDPFLCGQERREKGCRTATSVSITN 

PINRGPWWTHPGKELTPEHKGNVGHAGR 

DILAKAGAIIHLNIGEGTPVCCPLLEEGINPE 

VWATEGQYGRAKNARPVQVKLKDSTSFP 

YQRQYPLRPKAQQGLQKTVKDLKAQGLV 

KPCSNPCSTPELGVQKPNRQWR\TLCHQAT 

QALFNFLATCGYMVSKPKAQLCSQQ/RYL 

GLKLSKGTRALSEEfflQPILAYPHPKTLKQL 

RGFLGVIGFCRKWIPRYGEIARSLNTLIKET 

QKAKIHLVRWTTEVEVAFQALTQAPVLSL 

PTGQDFSSYVTEKTGIALGVLTQIRGMSLQ 

PVAYLTKEIDWAKWAVAVLVSEAVKnQ 

GRDLTVWTSHDVNGILTAKGDLWLSDNC 

LLKCQALLLEGPVLRLCTCATLNPATFLPD 

NEEKIKHNCQQVISQTYATRGDLLEVPLTD 

PDLNLYTDGSSFVEKGLRKVGYAWSDNG 

ILESNPLTPGTSAQLAELIALTWALELGEEK 

RANIYTDSKYAYLVIJ1AHAAIWKEREFLT 

SERTPIKHQEAIRKLLLAVQKPKEVAVLHC 

RGHQKGKEREffiENCQADIEAKRAARQDP 

PLEMLDCQPLV


2770 


A 


1 


2919 


MLLATALRGFLKNGDRGHVDTEEWRSYP 
WAASFGQLRSSQNCPGASASGRTGVPTVL 
VARTDADASDLITSDCDPYDSEFMTGERTS 
EGFFRTHAGffiQAISRGLAYAPYADLVWCE 

TSTPDLELARRFAQAIHAKYPGKLLAYNCS 

PSFNWQKNLDDKTIASFQQQLSDMGYKFQ 

FITIJ^GffiSMWFNMFDLANAYAQGEGMK 

HYVEKVQQPEFAAAKDGYTFVSHQQEVG 

TGYFDKVTTIIQGGTPDKAFTPHPASKPAH 

KPGEQPMO^LISIYMPTWNRQQLAIRAI 

KSVLRQDYSNWEMIIVDDCSTSWEQLQQY 

VTALNDPRITYIHNDINSGACAVRNQAIML 

AQGEYTTGIDDDDEWTPNRLSVFLAHKQQ 

LVTHAFLYANDYVCQGEVYSQPASLPLYP 

KSPYSRPJJFYKROTGNQVFTWAWRFKECL 

FDTELKAAQDYDEFLRMWEYGEPWKVEE 

ATQILAINHGEMPfflSSREHFRVLPFCRSTR 

PFRQARKISRVTVTSTKSDSLYTVGMLALS 

VRAIRCPLYLLTGLISVSKNGLWYCELQVA 

LHGRSVTLYEKAFPLSEQCSKKAHDQFLA 

DLASELPSNTTPLIVSDAGFKVPWYKSVEK 

LGWYWLSRRMQIEETFRDLKSPAYGLGLR 

HSRTSSSERFDIMLLIALMLQLTCWLAGVH 

AQKQGLDLGVYGAPETFUDGNGIIRYRHA 

GDLNPRVWEEEDCPLWEKYTLATIDVLQF 

KDEAQEQQFRQLTEELRCPKCQNNSIADSN 

SMIATDLRQKVYELMQEGKSKKEIVDYMV 

ARYGNFVTYDPPLTPLTVLLWVLPVVAIGI 

GGWVIYARSRRRVRWPEAFPEOSVPEGK 

RAGYVVYLPGIWALIVAGVSYYQTGNYQ 

QVKIWQQATAQAPALLDRALDPKADPLNE 

EEMSRLALGMRTQLQKNPGDffiGWIMLGR 

VGMALGNASIATDCYATGYRLDRTTVML 

DGDR 


2771 


B 


1 


1773 


MALGISAPVALQGTAPLLAVLSGCSFPKH 

MLQTVNGSPFWGLENGGPLLRARLGSAPV 

ETLELFSSLNKILHSYHSSWKCDLILLGRW 

TKAWDPLSAGGGCHTGPLPLQVEGNHPTG 

SYRVPNRPQYRSVAWGLGTSGLVNYTFLL 

NSGFHTYQFLRGNKDFLKNHIKLNYCFLLI 

EVDNLTLVFVIEKTLGQIFDIPKVELLFSYQ 

CFPMVENRQKPEGEEDCVIQLSELSCTECS 

KXAWRMEVLHTNKTTNATQCGGPAQLQQ 

FNAVLSEKVHTVPSLLRSWNnSHGRFPSFE 

TFNTKNCIAYNPNGNALDESCEDKNRYIW 

LEKPQETYShTORRESKHIPLRMAAERRRAE 

QKEKYPLIKSSDLGASEAIRQRQSSAAKLR 

KSGKESVREPWARVPGALGVAARALIAED 

AGLSRVILFHYGESWNLLRADQRLIFAKS 

WPRASRYQQGHQDLFILRSDLPSQVFIRDK 

IJsffiRRNRRTGRTEKAPJWEVTDRTVRTWI 

GEAVAAAAADGVTFSVPVTPHTFRHSYAM 

HMLYAGIPLKVLQSLMGHKSISSTEVYTKV 

FALDVAARHRVOFAMPESDAVAMLKQLS 


2772 


C 


148 


306 


MRPCCWWATLCGKHLRMCSHALKMRPN 
ASAAETEQLNAHSRGLMNSSSRPAP* 


2773 


A 


2874 


3062 


GNRAGALPGATLLILAGFLPSAHQNRPSRN 
P VSRPPNTQRVARRKHY ALADGYTERRWT 
NAP 


2774 


A 


1 


660 


MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPV 
AQYPTIAPPAVTISASYPGADAKTVQDTVT 
QVIEQNMNGIDNLMYMSSNSDSTGTVQrr 

SVEKSSSSFLMWG\ONTDGTMTQEDISDY 
VAANMKDAISRTSGVGDVQLFGSQYAMRI 
WMNPNELNKVERNSRRODVGERDISSGSR 
KVNKESREDEEVT 


2775 


A 


78 


264 


PVEPvSNLGVRLYACCGLLLRPAYPQHFAH 
GYVDKIPDYPRRAGTLTGLHPMQVCRCRR 
AREL 


2776 


B 


1 


921 


MLDDYGGSLSELAREQLPAAEQAALAQLA 

ARSLAPVPDDTGGAGMSNDTPFDALWQR 

MLARGWTPVSESRLDDWLTQAPDGWLL 

SSDPKRTPEVSDNPVMIGELLREFPDYTWQ 

VAIADLEQSGRIGDRFGVFRFPATLVFTGG 

NYRGVLNGEHPLAELINLMRWLVEPQQEL 

HOPLTTVONANDCCCDGACSSTPTLSENV 

SGTRYSWKVSGMDCAACARKVENAVRQL 

AGVNQVQVLFATEKLWDADNDIRAQVES 

ALOKAGYSLRDEOAAEEPOASRLKENLPLI 

TLIDSSYFPHGTELAF 


2777 


A 


47 


275 


FPCPPAPHVCGPPPCPRAFPVGQSSSQPQV 
A TfiFP *SP VCPPPRT .YWCiPGTERHWVETH 

A A VJA A OA V VsX A A XVJLr A TT VJA \J A AwAVAX T » V Xj X XX 

YRAFLPSQHLSSPVTAA 


2778 


A 


*7 /in 

749 


1020 


VLVRDPSQPAQPFSVSFSPQKHRDEKLYFL 
PKGVSGGSELRGRPOPYLPCPVSPTLCPWG 
HLSLAPPSVPPTACESSSELWPSLSWTWAE 


2779 


A 


271 


86 


MPLHTCLVHVGVSHAARGSPVCPSVLWV 
WFCVHFQVfflMWAHECVQADWAHIQD 
CAQVCV*


2780 


A 


3 




AAANRKRAAYYSAAGPRPGADRHGRYQL 
EDESAHLDEMPLMMSEEGFENEESDYHTL 

| . i fJJU * * I LJ U r J '1"' 1 JL/iTXAT±LJUJU\JJi X^X 11 <l XXXXXrf 

PRAJRIMQRKRGLEWFVCDGWKFLCTSCCG 
WLIMCRRKKELKARTVWLGCPEKCEEKH 
PRNS1KNQKYNVF1F1PGVLYEQFKFFLNL 
YFAVISCSQFVTALKIGYLYTYWAPLGF 


2781 


A 


2 


141 


EQFLRRQIASEKEEIERLKAEIAEIQSRQQH 
GRSETEEYSSLLLQF 


2782 


A 


3 


402 


GNGGFVVHWLNNKEFHFTSSTEVFMHOLR 

KLSDKQVDHENDDADREDEEHSQEDRER 

GLHMKLDHDLSLDRESEAGTGSSEHEDGE 

REGSPRTYSRI^VPMPU > TVLLDRKIETLLT 

EWNKNPDMLFTIHPMY 


2783 


A 


333 


695 


ISVFRSPGQSTSQHDAATWPFLHISGEGPTP 
SRRKAPPAFHPHTQACPSTCY CHTLASRRG 
PCNGRYHRPVYPHPTAMQRDPPAGPRGCQ 
SPCVraYTPAOlHPCGRHYR*HGQHDPPPW 

^ ilL/ rvJOr Oylj JL o\<irLL//\ V 1 Wx^x-JrxxJTVjx^xv. 

PTASRRKAPPAFHPHTQACPSACY CHTLAS 
RRGPSNGRYHRPGYPHPTAVQRDPPAGPR 

PWQ 


2784 


A 


91 


297 


MSLVKIJmLWSYRRGAVITIKIEVKIKVT 
YVKCQAHGERLINGHYDYSACHVIKLMFC 
AEEKKPHQ* 


2785 


A 


2 


103 


TGEKWPGEVNPPNGPVGDPLSLLFGDVTS 

JLJvor L/oL 1 OL/OJJlIAIlv^JL^JVJJ-yoJVl 1 ivoivxAoVJ 

GQRANRDGTKRSSCLVTYQGGGEEMALP 

DDDDEEEEEEEEVELEEEEEEVKEEEEDDD 

LEYL*EGSTRRGKPTQWPCGGPTEPLVWG 


2786 


A 


24 


332 


nPOVTAPT MAMFnP^V^PN^TVRYFDNGT 

yi y I lATLlVl/VlirL/rO V OXUNO 1 V X\. X X JL/i^i VJ i 

ALWQWDHVHLQDNYNLGSFTFQATLLM 
DGRIIFGYKEIPVLVTQISSTNHPVKVGLSD 
AFVWHRIQQIPST 


2787 


A 


210 


281 


FHHKQLHNPVLECHQPAGPCHYL 


2788 


A 


2 


1211 


WTPPGAPGAKGPRQGGCCSGLLRPPRVSG 

KTCGARPPWPWRSLSRIPKREGLGEEDTA 

VAGHELLLPNERSFQNAAKSNNLDLMEKL 

FEKKVNINVVNNMNRTALHFAVGRNHLS 

AVDFLLKHKARVDVADKTRMRELLLEIFL 

TWRAQFHDLHCLESKLEDCEMRDTLRHM 

QAVYRETMLTHTVTCVRLGALSYLKTMA 

CRPQQMLSDKNMDSVLTSYMNLGKLHNL 

SVLQFLYLKNEDKNSTYVNLELSEPJPTLIF 

/^Tni r Pii r vpTt\/TiyroT A oa/tt wt at tt fwtv 

V^lV^isJr I IvtS V IVlv^i^AV^lVLU V V J_>/VL» 1 l^r or 1 V 

VVLNSHIAMWSERIFKAIGDLI^PJCIHIHIY 
DKNIAYESAVPIMPVIPQTGSPTYTSSAALP 
QCLTPGNTTHSVATVNGSSWSSALRSQCDH 


2789 


A 


1 


334 


FRANRTVKDAHSIHGTNPQYLVEKIIRTPJY 
ESKYWKEECFGLTAELWDKAMELRFVG 
GVYGGNIKPTPFLCLTLKMLQIQPEKDIIVE 
F1KNEDFK*VQCSLANIRGMY 


2790 


A 


3 


1794 


AMLPMELGCGPLPEPLPVGCSRFSLFK*QT 

CISTVP/GYMVTAQSMSSTPPPPSPSTLPSSP 

SPPPPLPQPLPPPPPSPPTLSSLSSPSPPRPPL 

VSPSTLPSPQPSSPQPLLPPSSSPPSLPSPPPP 

SPPLPSPSPSAIPSLPPPSPQPLPPPPPSSPPPS 

LPSPLLPPPPLSSSPSSPLSPSPPPPSPPPSLPP 

SPPPSPPPPPPPQPPSPPSSPLSSPPLSSSQPSL 

LPPSLSSLPLPSSPSPLLPLSLPLSISPP*LSLL 

SPLPPSPSIPPSSFQST^GQCFSIVVMWHV 

APCTYIALAGNTLMAWPLMSASSKASGG 

VSMFVWRNVEPCSVAVFSWYSVPFLTPPC 

SRVRPSNLPVTQWPPTRAKNLPSRQLLLTS 
VHQAQSLSALCKEQDSSSEKDGRSPNKWD 
KDHIWWPMSGGHDLQQAAPGPGRAHQGH 

LAFQHMAGEDIASDEEHMVIHEEEGVMVS 
LLMTALAPLTLISSSRIFGKVYGPTPSSSYT 
YSDASSSTLAPTSFLLGPGAFKAQESGEEA 
EDGLRELETEKALSSSL/RRALDQ/*LALIM 
OLFQAHCFFLST 


2791 


A 


230 


2579 


AICDPCYWRMEKSPRMMEKKLSKGNOPD 

WESRWENKELSTKKDNYDEDSPQTVIIEK 

WKQSYEFSNSKKNLEYIEBCLEGKHGSQV 

DHFRPAILTSRESPTADSVYKYNIFRSTFHS 

KSTLSEPQKISAEGNSHKYDILKKNLPKKS 

VKNEKVNGGKKXLNSNKSGAAFSQGKSL 

TLPQTCNREKIYTCSECGKAFGKQSILNRH 

WPJHTGEKPYECRECGKTFSHGSSLTRHLI 

SHSGEKPYKCIECGKAFSHVSSLTNHQSTH 

TGEKPYECMNCGKSFSRVSHLIEHLPJHTQ 

EKLYECPJCGKAFEHPvSSLIHHQKIHTGEKP 

YECRECGKAFCCSSHLTRHQPJHTMEKQY 

ECNKCLKVFSSLSFLVQHQSIHTEEKPFECQ 

KCRKSFNQLESLNNmiJR^IHIRLKCDFYLM 

NAIYVGKPLVIGHPCFNITEFILERNLTNVL 

NVGRPSAWQTLPYIREFILEKSHINWSVG 

KLLAKAQILLPIKEYIMERNPIVWEPLQPW 

SRQALGHQAGESRGHTQRCKVTRLSSWQ 

VLVGAAVPCSGARDRVPVPRHVPQACLQG 

RVQTGRLDWRGHACSASPNAVPTVTFSDV 

AIDFSHEEWACLDSAQRDLYKDVMVQNY 

JCJNirf V o V VJl-rOl 1 JSJT I V 1 X XjJ_<xiXiVjrwxix rv xvx » x^xv 

KLSKGMIPVLEVLARAMRQKNEIKGIQLG 
KEEVKLSLFADDMIVYLENPIVSAQNLLKJL 
ISNFSKVSE1PKSMYKNHKAFLYTNNRQTE 
^OTMSFT PFTTASKRIKYLGIOLTRDVKDLF 
KHNYKPCSTK 


2792 


A 


154 


331 


IPAAATCMGSLLGG*ETPGLWARRSVKSR 
GT FPGT.PSPSRASVRSLLLLPAWAAFLEGIV 
DTRPTAWRAFPWTLFLSVFCQFLDFPETSL 
DSQKLSLDTPSF 


2793 


A 


213 




TT T ORSI GVGGHRAWGIOEPSKVLVSGRRT 
EAPSMLQMGRQMWGRTSWRWTRTWRCG 
WPWGGPIJWUIIWSSCTKQGH 


2794 


A 


515 


278 


IFTIJTDK1JSSQIPSIIJISQYQSCLYDPS 

PPTSDAHDHKHGPH^ 

RSIQYISARPQLKGPF 


2795


A 

A 


1 


708 


VTAGVPKGHCPRRGTSSAIASCPPYGSPPR 

AECALRAGSTVTT*RRSCCTSYSSGRPPTG 

RRGSWTLVCTSCCASWRRACSRRSSTSSSS 

ATARCLRPWDSLRPCSGPSPSTSSGPSSCSE 

AITWHrvm > SMRCSRMPQRSLASn>YMS/S 

SDQPTPK5*RLLQNVGSSS*DEG1PHVHTPG 
GICQPCSGDKAGFRGSRAQPARKPSPTVQR 
KONFNGKLVCFIPLGSAGKAVrWV 


2796 


A 


2 


590 


FQGRGLAANDGEYLKLQWRAGTLVLAPS 

CPLSTLSVLSSPPRELQAMEALQNGQTTVE 

GSffiGQSAGAASHAMIEKILSEEPRWQETA 

YVLGNYKTEPCKKPPRLCRQGYACPYYHN 

SKDRRRSPRKHKYRSLG/TQEASHGREEW 

QGRGQAEAAPTGSPGGGEAGPGDDRIASP 

GPRGGHSEDSWTVGAQLHLLHE 


2797 


A 


319 


513 


IELRAVAQGIAQSLGQLLFTQCPLEKKDLE 

GLFLQNNKEGVQKGRDEPLPPLP*ATALSS 

IQAGIQQAR*EGDLEAWQFPVRIHPPDQQG 

NITVTFEPFPFKLFKEFKQAVNQYGPGSPFV 

MGLLKNVAVSSWMIPTDWDALTRACLTP 

AOFLQFKTWWADEAGRV 


2798 


A 


1 


915 


MSTAVWKWLCTVAPGRGSAPSLSSCLD 

WKVNGAEGSHNKDLFVLTYGALVAQLCK 

DYEKDEDVNQYLDKMGYGIGTRLVEDFL 

ARSCVGRCHSYSEIDDIIAQDMERGFCALHI 

DTEGRYEWWTTSTQLQSTLPRAAQCSVYQ 

KQPDRKSLTVGQKIEVGNPGIGTEQSPQGL 

VRFATQAFLTTHRAEGLQQSQVKGSVIHL 

KSQDKCGEHRFTTNQVETGDPVRESSSQH 

SVGRGGPKDIQIQGANVPVRQCNLLWRITL 

GPLETPHLEFSGECSLLAAMEAPEHTWDQ 

EKSDIPEPPHRSS 


2799 


A 


75 


642 


EKLLNPQTTSFFLQLLQKKQWYPKSFPCCL 

PSQGLLPAARVQKCLLVLRNVSGSPFPFLI

GFPPPILELKESYP\WAGTDIQCEPAQGHVL 

TSPSPTLR\LQGAPDLPAGEPAWLLLTAREE 

DDG*NFSC*ASLWQGQRLMKTTVIQLHIL 

CEWRPDLSCQNKDYYFPISRELLGQQCFnT 

VATFFSL 


2800 


A 


1 


1146 


MVGECGTKLEVMQVHLSNPRDELEGELRS 

IRVTMGQVWALVHSTLEPFHTNEEEEGLY 

NKVTEEVTEQVCLPAKAKAAKEGEVHPYP 

SPFPHYFEETEWPDPPDLSFLEDTGGDPSLT 

SHWQLTKEAEAELQLIEKQVHKAQINRIDP 

EKIPDLLIFSTQHSPTGVIVQEQDLVEWLFL 

PHTNSWTLTPYLDQNATMIGNERTQIVKL 

HGYDPRKITVLLMKANIQQAFINGLTWQTH 

LANFVVILDNHFPKMKLFQFLKLTNWILPK 

ITKFKPIKGAENVFTDGSSNGKASYSGSKG 

LSQQLIWISSRNLKPYHESDAEEEIPGRTQG 

TPGCSHVETDTEEDPNCHEQHPLNTATHL 

GTDOEAVTDGGRKPEERGTTSHNE 


2801 


A 


2 


926 


RPEPSCRPRSEYQPSDAPl'liKJiiyx^KJJfK 
AWPLPRRGDHPWIPKPVQISAASQASAPIL 
GAPKRRPQSQERWPVQAAAEAREQEAAP 
GGAGGLAAGKASGADERDTRRKAGPAW 
MVRRAEGLGHEQTPLPAAQAQVQATGPE 
AGRGRAAADALNRQIREEVASAVSSSYRN 

EFRAWTDIKPVKPIKAKPQYKPPDDKMVH 

ETSYSAQFKGEASKPTTADNKVIDRRRIRS 

LYSEPFKEPPKVEKPSVQSSKPKKTSASHK 

PTRKAKDKQAVSGQAAKKKSAEGPSTTKP 

DDKEQSKEMNNKLAEAKE 


2802 


A 


25 


435 


TKYWLLLFFLILILPFFFWRRSRSVTQAGG 

QWHDLGSLQPPPPGFKQFSCLSLPSSWDYR 

RAPLHLANFYIFSRD/MDFTMLAPvLVSNSR 

SQ/CDPLASASQSAGISGKSQHTRPVLVLLK 

TYTNSH/SF*VKGLGWEFIL 


2803 


A 


1186 


1074 


TAAARRSSRTSSHRSLLHVPENLATGPSEF 

RSPGFLLSRVPSVWDPTENRTVQLTWQPLP 

EPLELWPKA/HLTDSFPDLLGLAAED*HCPI 

ASEAP *TITDAELRVTLTVEGKPFPFLINTE 

ATHSTLPSFQGPVSLASrrWGIDGQASKPL 

KTPQLWCQLRQYSFKHSFLVIPTCPVPVLG 

♦DTLTKLSASLTIPGLQLYLIAALLPNPKPPL 

RPPLVSPDLNPQV*DPHSCPPENKPPLTVIF 

LYLPKSYKTAPPHLPLLTLFSDSARLHPGEI 

NSHVAHTKPVWWSLHTDAHEIWCRHSDR 

GTSLGRSPCPPALCSMRKIHLRPQVLRQTS 

PRNISPISNPVSGLFLLSSPTCLTIPQPLSPFN 

LGATLQSLPSLNFNSFHFLVETKETRFICGP 

KTP ALVTD WEGSLPLMFNHCRDTSLIIHPC 

FQGVRPCRDACLSPSPLAASPAFLGKGQVP 

LNPFFTLSGKSRFSGGGASTPTPSFHVSTPS 

LLFWGRGKYPSTPSSPLVASPAFLGKGQVP 

LNPFSFTLSGKSHFPGTGARFN 


2804 


A 


3 


810 


GVSPCWPGWSRTPDFGSNPKCPPIRASPGA 

ELQALSSTVTTPYWGILVTAVFPH*GLRPR 

QCRQDHPAGRQGPGPGEVPEILGQSGCTD 

RTWSKAGGRTQAPGPRSRAGRRVSGQEIR 

APGPLGCRHGG/V GAP WTPEAASPLTATEP 

SCPH/LQAPCGYMPLSVSPRRRYRGPAGDQ 

KVKMLKFKAFCLDYWQFLCLQPLHGAYK 

RDSDUvOWTWGLLPEVTGAAGTTSPNVHT 

SGRFFRACVFCPVHTLVKKEPHPGQQEIIM 

EPSPWSP 


2805 


A 


62 


475 


FEPLFYLMCLLNLFPLQLPRHPFLFLTVDLV 

NTWGCPLPSSPQ*EWLLAAPHRSTPPPLSS 

GFPARRQLEPGAGARGP/HHTQALHLSFFF 

VFLRRSL/DSVAQAGVQWRGLGSLQPLPPG 

FVMLSSPLSLPSLTY 


2806 


A 


3 


4804 


KRLENIQKTLEVAFSEAVWMQPSWLLDD 

LDLIAGLPAVPEHEHSPDAVQSQRLAHALN 

DMIKEFISMGSLVALIATSQSQQSLHPLLVS 

AQGVHEFQCVQfflQPPNQEQRCEDLCNVIK 

NKIJDCDINKFTDLDLQHVAKETGGFVARD 

FTVLVDRAfflSRLSRQSISTREKLVLTTLDF 

QKALRGFLPASLRSVNLHKPRDLGWDKIG 

GLHEVRQILMDTIQLPAKYPELFANLPERQ 

RTGILLYGPPGTGKTLLAGVIARESRMNFIS 

VKGPELLSKYIGASEQAVRDEFIRAQAAKP 

CELFFDEFESIAPRRGHDNTGVTDRWNQL 

LTQLDGVEGLQGVYVLAATSRPDLIDPALL 

RPGRLDKCVYCPPPDQDGSSSSDSDLSLSS 

MVFLNHSSGSDDSAGDGECGLDQSLVSLE 

MSEILPDESKFNMYRX.YFGSSYESELGNGT 

SSDLEDESMNQPGPIKTRLAISQSHLMTAL 

GHTRPSISEDDWKNFAELYESFQNPKRRKN 

QSGTMFRPGQKFFDEITELTYLPSFHHKAA 

PHQAEPGPNSSSASAPPPYNPFITSSPHTQS 

GLQFRSVTSPPPSAQQFPLKEVAGAKGIVK 

TALETAPTLALPVSSQPFSLHTAEVQGCAV 

GILTQGPGPCPVAFLSKQLDLTVLGSPSCL 

HAVASAALILLEALKTTNYAQLTLYSSHNF 

QNLFSFSHLTmLSAPPvLLQLYSLFVESPnT 

ILPGPDFNLASHnLDTTPDPDDCMSLIYLTF 

TPFTHISFFSVPHVDHrWFTDGSSTRPDRHS 

PAKAGYAIESSTSIIEATALPPSTTSQQAELI 

ALTRAFTLAKGLHVNIYTDSKYAFHILHHH 

AVrWAERGFLTTQGSSHNASLIKTLLKAAL 

LPBCEAGVTHCKGHQKASDPITLGNAYADK 

DRTIDGSSQVIEEKNHNGYSVIDTGTLVEA 

ELEKLPNNWSPQTCELFALSQALKYLQNQ 

KTISILIQKEPSPALGLTPERKGNVGHAGKG 

PLESSSPDPFLCGQERREKGCRTATSVSITN 

PINRGPWWTHPGKELTPEHKGNVGHAGR 

DILAKAGAIIHLNIGEGTPVCCPLLEEGINPE 

VWATEGQYGRAKNARPVQVKLKDSTSFP 

YQRQYPLRPKAQQGLQKTVKDLKAQGLV 

KPCSNPCSTPELGVQKPNRQWR\TLCHQAT 

QALFNFLATCGYMVSKPKAQLCSQQ/RYL 

GLKLSKGTRALSEEHIQPILAYPHPKTLKQL 

RGFLGVIGFCRKWIPRYGEIARSLNTLIKET 

QKANTHLVRWTTEVEVAFQALTQAPVLSL 

PTGQDFSSYVTEKTGIALGVLTQIRGMSLQ 

PVAYLTKEroWAKWAVAVLVSEAVRHQ 

GRDLTVWTSHDVNGILTAKGDLWLSDNC 

LLKCQALLLEGPVLRLCTCATLNPATFLPD 

NEEKIKHNCQQVISQTYATRGDLLEVPLTD 

PDLNLYTDGSSFVEKGLRKVGYAWSDNG 

ILESNPLTPGTSAQLAELIALTWALELGEEK 

RANIYTDSKYAYLVLHAHAAIWKEREFLT 

SERTPKHQEAIRKLLLAVQKPKEVAVLHC 

RGHQKGKEREIEENCQADIEAKRAARQDP 

PLEMLIKQPLV 


2807 


A 


1 


591 


MTPRGTGGDSEVPFQAAKPLSVKQGVSFR 
LWARRRPRCDFLRSSRIRVHPTPAASTMPP 
KFDPNEIKWYLRCTGGEVGATSALAPKIG 
PLCLSPKKNRQAQIEWPSASALIIKALKEP 



WO 03/080795 PCT/US02/25485 




PRDRKKQKMKHSGMTFDErVNIARQMRH 
RSIAREI^GTDCEILGTAQSVGCNVDGRHP 
HDHDDINSGAVECPAS 


2808 


A 


1094 


483 


IGCDVLINNAGIFQCPYMKTEDGFEMQFGV 

NHLGHFLLTNLLLGLLKSSAPSPJVWSSK 

LYKYGDINFDDLNSEQSYNKSFCYSRSKLA 

NILFTFELARPXEGTNVTVK^HPGr/RTN 

LGPJT»NTFHCWSNHSSIW/WSWAFFKTPVE 

GAQTSryLASSPEVEGVSGRYFGDCKEEEL 

LPKAMDESVARKLWDISEVMVGLLK 


2809 


A 


1778 


1981 


HTWQNSLIVLFRGCRSAHAKVHRWKN*LP 

LNLAPLLPRSGSSAPIRPPPSAQARQPMKST 

YGVDRRHS 


2810 


A 


272 


51 


MLLLSSSLLKCGTCQWQVQPAVAGSLEGG 

EEESMVSALLISALPFLGTSHVTVETLDVQ 

YTVFPKLICFLPCE* 


2811 


A 


3 


357 


FGFNGCSKRIIKLQELSDLEERENEDSMVPL 
PKQSLKFFCALEWLPSCDCRSPGIGLVEEP 
MDKVEEGPLSFLMKRKTAQKLAIQKALSD 
AFOKLLIWLG/ODCLDHP*STSVSVSK 


2812 


A 


94 


3006 


RTRSLTRKAMAEHAPRRCCLGWDFSTQQV 

KWAVDAELNVFYEESVHFDPJDLPEFGHV 

LDVHGVHVHKDGLTVTSPVLMWVQALDII 

LFJKMKASGFEFSQVLALSGAGQQHGSIYW 

KAGAQQALTSLSPDLRLHQQLQDCFSISDC 

PVWMDSSTTAQCRQLEAAVGGAQALSCL 

TGSRAYEFNLVCDRKHLKDTTQSVFMAGL 

LVGTLMFGPLCDRIGRKATILAQLLLFTLIG 

LATAFVPSFELYMALRFA\GLLPSLDLASA 

MSPY*QNGWGPHGGRRPWSWPSATSPSGR 

WCLRDSPTVSATGGSFRSPALRLAYCSS\LL 

LGSARICTLAPDPWEDGRGDTTDPENGLG 

Q*AETLPGAHEPAGPREDRPLRECPGSVQT 

PPAPEGDPDYLLCLVCGQSGVLRPEPPSGG 

LRPGRLSDAAHLWSC*GACPLFQHLHDAE 

VWPQVEP/RWGPWSWVA*CVSSSSSSQQIC 

PWWSPCWLWWGKWPQLLPLPSPMCTLPS 

FSPPSSGRQAWGWWASSHGSGASSHHL*S 

CWESTTLPSPCSSTAASPSWPA/SLCTLLPE 

THGQGLKDTLQDLELGPHPRSPKSVPSEKE 

TEAKGRTSSPGVAFVSLGTSDTLFLWLQEP 

MPALEGHIFCNPVDSQHYMALLCFKNGSL 

MREKIRNESVSRSWSDFSKALQSTEMGNG 

GNLGFYFDVMEITPEnGRHRFNTENHKYF 

KGKGAPGHPMPSLKANFDLLACLRGVGSS 

TLLLWPAVLGAQTRQAGVNEGRSQVADF 

LRIPVTGCPEQRRNPPSPPAPLGTGGPAEER 

LQFPGVAGSRRGRGRILRAGGIGRASPGEG 

TGAPRPRAGQGRGGPGKPESGGGGPVALR 

PGDCTCCVLKSQPRQQRRGACSAMAFRVR 

LRVRQSVRPPRGVTVAALQRPETQGPAPSS 

ARPDCGPESRGGLALWRRLRGYASRDRVL 

CNRRCPHAARFPSKRTPSGSPHLHLMSSW 

AVP 


2813 


A 


1 


897 


MTYGVGKGDMVDGTKERGERIESALGTS 

HIMRVAEPQGSQSWCPDEELRPVGSPATA 

AQKLPSTPGALGPTHSTECCSIPLDPKAQQ 

GLQKTVKDLKAQGLVKPCNSPCNTPILGVQ 

KPNGQWRLVQDLRIINEALVPLYPAVPNPY 

TLLSOIPEEAEWFTVLGLKDDFFCIPVHPDS 

QFLFAFEEPSNPTSQLTWTVLPKGFRDSPH 

LFGQVLAQNLSQFSYLDTLVLRYVDDLLL 

AARSETLCHQATQALLNFLTTCGYKVSKP 

KAQLCSQEVTYLGLKLSKGTRALSEERIQP 

TLA 


2814 


B 


71 


2167 


XPAEAIJODGEERQKNKKKA^ 

RAKEYESLMETKNSGSDSPYKAKLQRLAK 

DLLKQVQVQDSGSWANNKVSALDRTLGEI 

TRILEKENVADQIAFQAAGGLTALEHILQA 

VWATJWKTVLRNSSMPQDSYMQCVTLCF 

AVTGRSYSIFDNNRQDPTGLTAALQATDL 

AGVLHMLYCVLFHGTILDPSTASPKENYT 

QNTIQVAIQSLRFFNSFAALHLPAFQSIVGA 

EGLSLAFRHMASSLLGHCSQVSCESLLHEV 

IVCVGYFTVNHPDNQGDRAVRPPPHSAAK 

SSASCPSSISVTHG 


2815 


A 


i 
i 


*f / j 


FVRWNSPPTDSLSPDGGSIELEFYLAPEPFS 

MPSLLGAPPYSGLGGVGDPYAPLMVLMCR 

VCLEDKPDCPLPCCKKAVCEECLKVYLSAQ 

IQCPTCQFVWCFKCHSPWHEGVNCKEYKK 

GDKLLRHWASEIEHGQRNAQKCPKCKIHI 

ORTEGCDHM 


2816 


A 


i 


1286 


RGAVFPGPEHSVPEESVTFEDVAWFTDEE 

WSRLVPIQRDLYKEVMLENYNSIVSLGLPV 

PQPDVEFQLKRGDKPWMVDLHGSEEREWP 

ESVSLDWETKPEIHDASDKKSEGSLRECLG 

RQSPLCPKFEVHTPNGRMGTEKQSPSGETR 

KKSLSRDKGLRRRSALSREILTKERHQECS 

DCGKTFFDHSSLTRHQRTHTGEKPYDCRE 

CGKAFSHRSSLSRHLMSHTGESPYECSVCS 

KAFFDRSSLTVHQRIHTGEKPFQCNECGKA 

FFDRSSLTRHQPJHTGESPYECHQCGKAFS 

QKSILTRHQLIHTGRKPYECNECGKAFYGV 

SSLNRHQKAHAGDPRYQCNECGKAFFDRS 

SLTQHQKIHTGDKPYECSECGKAFSQRCRL 

TRHQRVHTGEKPFECTVCGKVFSSKSSVIQ 

HORRYAKQGID 


2817 


A 


94 


255 


MLYDBCKSHKLVAPLAVFFALFFLLIFFWV 
AFSYPFELLFLOLRSRQADIGVQ* 


2818 


A 


551 


19 


TGTTOKLQGSGPHLLRDWAFHPPWRKICL 

HCXCPQEEHMVTVMPLEMEKTISKLMFDF 

ORNSTSDDDSGCALEEYAWVPPGLKPEQV 

HQYYSCLPEEKVPYVNSPGEKLPJKQLLHQ 
LPPHDNEVRYCNSLDEEEKRELKLFSSQRK 
RFNT rjPO>rVPPFPVTMTfiATPFr)V t ?Mr) < ;(i 

XxJDlN JUvJJxVJIN V X\rrr V 1 IVl 1 VXrYlV^Iiv^ V olYLLyOVJ 

Y 


2819 


A 


236 


559 


MWLEPMQMGFLHMMEKMAARTSAILD*G 
TLK*FHFTLTTSLKALSSHTPIFPGTGELQLP 
VSPSVCLDQGMQLKPSTSSHLLKTVKPRM 

VPPQT T WN/TK'n^FFPK'TVT 


2820 


C 


209 


592 


METETKESGKNKKIPPKHQIENVGVGGLG 
AQDGLNQIGKIPPVLSCSQSRFGTMPAAFP 
CVFPPQSLQVSPQMSSKAWEKQSLPLPGLR 

<"}<?P VFR TfMP'MVnT PT PVPT RNTFTOPROKPV 

LFWRKANR 


2821 


A 


JOl 


j j 


GYSLWRRFLSRSEKRNIRVGVTRFSRCV/L 
SPLSLTQKGNSLTPCASQVRQCLALLRLAH 
GACTHWPAPTVWHSLVR 


2822 


c 


2 


166 


MQKRHNCKKVHALPPAVLGFQRASGCRF 
ANKRSRITHFGGRRLSLTPASDSAGV 


2823 


A 


164 


423 


RGPVSRNQPPFTPJPQTRKTTETHVRGQSL 
PRPGTQSLQTKAAQVPSPQRLPKNPE*AV 


2824 


A 


792 


389 


PTRPPLVQLQAPRAHLSEDQKRLLLMKQKG 

VMNQPMAYAALPSHGQEQHPVGLPRTTG 

PMQSSVPPGSGGMVSGASPAGPGFLGSQP 

OA ATN/TSrOMT TDOR AOT WClCiK HnPT RPf>R 

QQQQQQQQELAEQVTCPLA 


2825 


B 


1279 


1479 


MVPLCQVRVAGVRAGLALVSRTSPLAPNL 

AGVLGSGAPPPPPPGPSCLRALLRLPQQKS 

GPLRELLSAHGSKDGLWKAPTHFYDHLF 

PPT FVT MTf T K F 


2826 


A 


1 


412 


MKALLALPLLLLLSTPPCAPQVSGIRGDAL 

ERFCLQQPLDCDDIYAQGYQSDGVYLIYPS 

GPSWWWCDMTTEGGKWTWQKRFNGS 

VSFFRGWNDYKLGFGRADGEYWLGLQNM 

HLLTLKQKYELRVDLEDFEN 


2827 


A 


3 


711 


KIADFGFSNLFIPGQLLKTWCGSPPYAAPE 
LFEGKEYDGPKVDIWSLGWLYVLVCGAL 
PFDGSTLQNLRARVLSGKFRIPFFMSTECE 
TJT TPHA/fT VT TYPTdlfRT ^MFOTPK'TTK'WMTCT 

GDADPNFDRLIAECQQLKEERQVDPLNED 

VLLAMEDMGLDKEQTLQSLRSDAYDHYS 

AIYSLLCDPJ3KRHKTLRLGALPSMPRALGL 

SSTSQYP\AEQAGTAMNISVPQVQLINPENQ 

IV 


2828 


A 


1350 


2203 


TWRLDPOUSSPKPOPGGTYTLEWKSSKSK 
KVLSPHP*WPPLRLWQR\GGSPEGGTQAPD 
GSLPPPPPRPKSERVGSPKLSGGKR/EGSHP 
GGPPHTTHP/DGEEKAKSSWFGLREAKDPT 
QKPSPHPVKPLSAAPVEGSPDRKQSRSSLSI 
ALS S GLEKLKTVTS GSIQP VTQ AP Q AGQM 

VDTKRLKDSAVLDQSAKYYHLTHDELISL 
LLQRERELSQRDEHVQELESYIDRLLVRIM 
ETSPTLLQIPPGPPK 


2829 


A 


2 


259 


WQGGILGSDPTPPLTSPNLLQTACFREERD 

V/RRERGQPLGDHSALCLPRRGVPVPCDGL 

LCWWGPPDAAEPLRGPSPARAGPVLPG 


2830 


A 


1 


1062 


MTADAVLIKNfGSKDADWEYEEGDKLEEFL 

RSLNSSKPLYLGQTGLGNIEELGKLGLEPG 

ENFCMGGPGMIFSREVLRRMVPfflGECLRE 

MYTTHEDVEVGRCVRRFGGTQCVWSYEG 

RCSFRVVPDSATEFSMDFEKILMLDPTLHPL 

CQNLLQRLNTMWKPPNVGLVPSKATAQA 

VRWSLLAMARAGAATMPGALSQGCIEVS 

RLLKKLPDDEGITMDTVGFAPLCLWQRLT 

LANHQRYFADGPQPVCNHMQPAPHHFAS 

MRSSAASPTSLPAFADPAAVPPLEHVYVW 

TLLLCQRWCTYMYMDSTATTLTKHCCCPP 

PIPPIGVLLPADWGHIGPSSDSRSENKAMGS 

SPST 


2831 


A 


2 


238 


TKLNPKIMDVGWPELHAPPLDPCMCTICKA 
QESWLNSNLQHVVVIHCRGGKGRIGWISS 
YMHFTNVSAR*DEDVSSLS 


2832 


A 


3 


162 


RLHTANLGDSGFLWRGGEWHRSDEQQH 
YFNTPFQLSIAPPEAEGWLSDR 


2833 


A 


1 


988 


MPAEFFQRCSVIMVQLPWKEAHVERPHGE 

RDYTPDLQPDMWEKFPGLRRALRPVVKTL 

LVQLEYRQAEKCEKRDWPSLPDYIFLLCW 

MLPALEYRTPSSSVLELRLALRAPQPADSL 

LWDLVIVPITSLKSWQTPRGEVEGVTHEEI 

CASLKSLAVALLSMSDLTVGTPVTQPQTL 

NTMGIIGSRGGRGOVAALNR.OROVPELIIGI 

DILSSWQNPfflGSLNGRGYINSLALCHNLIR 

RDLDRFLLPQDITLVHYIDHIMRLDSVKDK 

WlilLAPPTTKKEAQCLVGL/FGFWRQfflSH 

LETAI7RPVTGLWWKLNI*LWAIKSPCNLN 

CLS 


2834 


A 


4061 


2827 


EAGPAPLSAAAPGAGRGWPRPLAERRKGR 

GRRQPLRARLNRRRWAAGQGSTVQAATF 

GPAMAAAPLKVCrVGSGNWGSAVAKIIGN 

NVKKLQKFASTVKMWVFEETVNGRKLTDI 

INNDHENVKYLPGHKLPENVVAMSNLSEA 

VQDADLLVFVEPHQFIHRICDEITGRVPKKA 

LGITLIKGIDEGPEGLKLISDIIREKMGIDISV 

LMGANIANEVAAEKFCETTIGSKVMENGL 

IJFTCELLQTPNFPJTVVDDADTVELCGALKN 

rVAVGAGFCDGLRCGDNTKAAVIRLGLME 

MIAFARIFCKGQVSTATFLESCGVADLITTC 

YGGRNRRVAEAFARTGKTIEELEKEMLNG 

QKLQGPQTSAEVYRELKQKGLLDKFPLFTA 

VYOICYESRPVQEMLSCLOSHPEHT 


2835 


A 


106 


1814 


OLLPTDTPTGNSSPSLPHLPFAGACGLSIYN 

LVPTQQKRWSGSSGFILSRKFrNYSPVPPS 

LQMFFRLQLPPVNSEETSHYEEPLPGRRVEL 

RYPLRQGTEATDGQVCGNEDMLIPJDRVRK 

TRGSAPPPAHNLAPTEVALEDVLRIFTSAW 

RGVDGALEKGGTSCPARAQLPAEPEDPLF 

RCLRVSRLKDREVRGLGLPRQLQGVWSTT 

YPRRHAIAEHAGSPKPLRKREPETWQANK 

KGVIGIQLVVTMVMASVMQKIIPHYSLAR 

WLLCNGRKYNGHffiSKPLTIPKDIDLHLET 

KSVTEVDTLALHYFPEYQWLVDFTVAATV 

VYLVTEVYYNFMKPTQEMNISLVCKVLFS 

LTTHYFKVEDGGERSVCVTFGFFFFVKAM 

AVLIVTENYLEFGLETGFTNFSDSAMQFLE 

KQGLESQTLLHINFIAPLFMVLLWVKPrrK 

DYIMNPPLGKESEPLMTEATFDTLRLWLnL 

LCALRLAMMRSHLQAYLNLAQKCVDQM 

KKEAGRISTVELQKMVARVFYYLCVIALQ 

YVAPLVMLLHTTLLLKTLGNHSWGYLSRI 

YLYLTSG 


2836 


A 


2 


774 


HSYSHSHGHCGSPAGDTEQGYKPVWPVCS 

LFPDGSHPGV*QPIHEPA/QGRGGLPPWGA 

A*TPRAWRLA*RPRG*AALPWA*TSPGRPA 

9API AHTOSGPPSRPTRAPGPSP/IPIONIKR 

PYPGEAFVPSRAGVPTVGVTRSFHLAPSLPP 

FPSS*I^PSLPPRTTTSCTRAILTPSS*QKLLY 

PPSRPWVLLVRRARPPAAAPTSEEPPERSP 

WETPHAAPSQLHELHETHSVAQKSDLLPA 

PEAM*PGSVSSRFLLY 


2837 


A 

A 


l 




P<IA AWAPKT OT T SVCROOLPGNPRARSHS 

HHRRTRARCPSGCGQARHSAGSWHKLQFP 

LCTWKMRSPLKMRSLliCMPSESIxMVVTF 

LISALESTEQYHGGVYTPCDIDSNIILSPPDI 

SNNITEGVYTPCDIDRHLIPFFIJPLDMRLQV 

LMPLDSGTCTSGFPEALRPSASD 


2838 


A 


14 


1256 


WPCGAAPGLTHASERMFTLTTMIQALAPV 

MGWDRKPIJCMFSSEEMRGHLHHHHKCLT 

KILKVEGQWDLJPSCLPLTDNTRMLASILIN 

MLYDDLRCDPERDHFRKICEEYTTGKFDPQ 

DMDKNLNAIQTVSGILQGPFDLGNQLLGL 

KGVMEMMVALCGSERETDQLVAVEALIH 

ASTKI^RATFHTNGVSLIiCQIYKTTKNEKI 

KIRTLVGLCKLGSAGGTDYGLRQFAEGSTE 

KLAKQCRKWLCNMSIDTRTRRWAVEGLA 

VT TT D AD VKDDFVOD VP ALO AMFELAKT 

SDKTILYSVATTLVNCTNSYDVKEVIPELV 

QIJ^KFSKQHWEEHPKDKKDFIDMRVKRL 

LKAGVISAIACMVKADSAILTDQTKELLA 

RVFLALCDNPKDRGTIVAQGGGKALIPLAL 

EGTD 


2839 


A 


1913 


1582 


EDSGLRLLWICLSLSLSFP*NRVSLCHPGWS 
AVARPQLTAARPSRLQOSSHLSLQSTWDH 

RHTPPYLALFFIFLFLVDM\SFTTWRP\^ 
WAQABLPFRPLKVLGLLA 


2840 


A 
J\ 


A A 




MYMLLO AF WLWOETLKTILLYKFTKPP AN 

XVX X 1 v *, 1 ■* 1 J Vjf ^* 1 T T X*> T V yiJ X JUXV A 1 1 il > X XWX X Au a x a-l t 

TPVLGVNAQVCHSCLAALRIRKVNGHKRN 
FKAQPPNGKLPLVLGCLCLLTDLIHALGYD 
CRRDFPVSLEYAELVFLFWAY* 


2841 


A 

A 


JZZ 




T TYRFT VFT f>OFT PP.P < ?*:SFT*^4T PGFPAAAY 
GPVAAAAVAAARGSGRKVYGTGDSQA 


2842 


A 


87 


439 


KTWTPQPRHPPPHPETSKPTPPC*GPVLCSC 
UCVMPRPLPP/PP*DLCSPPLLAPGPRRSAG 
GCWACQRRKKMSCLGGAGVCLKQGHGH 
MOT rVDT Ci\ STT AEPPGSSARRLPARSAL 


2843 


A 


1 


409 


MAETAVINHKKRKNSPRIVQSNDLTEAAY 

SLSRDQKRMLYLFVDQIRKSDGTLQEHDGI 

CEIHVAKYAEIFGLTSAEASKDIRQALKSFA 

GKEWFYRPEKDAGDEKGYESFP\WFIKHS 

TN1TSLSLWFFSSCTH 


2844 


A 


1 


894 


MPGPMSLWLLLLVLPLSLEHSDLRICFPGQ 

WSMESSSTGFIWTDVRAWQTSNRHVSSW 

REPRHSRMPPGAGLMERIQAIAQNVSDIAV 

KVDQILRHSLLLHSKVSEGRRDQCEAPSDP 

KFPDCSGKVEWMRARWTSDPCYAFFGVD 

nTRP<3FT TYT WVFWFPPPT PWRNOTAAOR 

APKPLPKVQAVFRSNLSHLLDLMGSGKES 

LIFMKKRTKRLTAQWALAAQRLAQKLGA 

TQRDQKQILVHIGFLTEESGDVFSPRVLKG 

GPLGEMVQWADILTALYVLGHGLRVTVSL 

KELQR 


2845 


A 


2 


1841 


TNDKNEIMITSVDGEKAFDKIQQPFMLKTL 

NKLVIJEVLARAIRQEKGIKGIQLGKEEVKL 

SLFADDMXVYLENPIVSAQNLLKLISNFNK 

VSGYICINVQKSQAFVYTNNRQTESQIMSEL 

PF1TASKRIKYLGIQLTRDVKDFFKENYKPL 

LNEIKEDTNKWKXIPCSWGPJD^IVKMAIL 

PKVIYPJFNAIPIXLPM'IWIKIEKTTLKFIW 

NQKRAHIAKTIl^QKNKAGSIALPDFKLYW 

KATVIXTAWYWYQNRDIDQWNRIEPSEITP 

HIYNHIIFDKPDKNKKWGKDSLFNKWCW 

ENWLAICRKLKLDPFLTPYTKINSRWIKDL 

NVPJ'KTIKTLEENLGNTIQAMGMGKDFMT 

ETPKAMATKAKEDKWDLIKLKSFCTAKET 

TIRVNRQPTEWEKIFTIYPSDKGLISRIYNEL 

KQINKKKSNNPimWAKDMNRRFSKEDIY 

/VTxXN X\XxiVJJ>k_rV. uOO Xj/\XXVXv1Y1 V^/XXV X I XVXXV X XXXw X 

PVRMAIIKKSGNNRCWRACGEIGTVGYKN 

DRQETQRTRlULHMLEDKPYGEINQrFLQV 

GQRKNGYARPQKSCLPCNIFQYVFQKKMK 

EKTKKEKKWNLGNTRIKPEKGKENMGGT 

VLPPSSPHWVEYEPPVSSP 


2846 


A 


60 


493 


EAGKRESSRDKGARCVYTRHGLRASIPAP 
GLRSRRGEQGCSGIRPSCGKRLVCPGCRNQ 

ENPEGNRGKGAARFTRESASGRGESRSAR 

GSBERSGDMRTYWLHSVWVLGFFLSLFSL 

OGLPVRSVDFNRGTDNITVROGDTAIL 


2847 


A 


395 


3 


GGQGVTPWPSSCLPGTGSPAPSPTRLLGPT 

PRDRAEATVGPDSATCSQTEGAQEGGRCLP 

PG/MELPAGDGAGRRVGQGGPEGQLGGQQ 

RGKGAGPQPPPQEQPGLAWVGDRLEHPRL 

CLPPTCGHRAGSPGW 


2848 


A 


514 


738 


MNSLSWGAANAVLLLLLLAWASPTFISINR 
GVRVMKGPSAFLSGDDMKFAIPKEKDACC 
IRESSTRXXRSGSAGL 


2849 


A 


2 


427 


HVIKVLHDDWIFTPFIQGP*SM/CSSKNESR 

fflGS*RVTG*LLEVLKSLL*SFGRLNALNM 

KSL/TSEVQEE*RKLNKTHRVQRDFDKDRK 

LAVGQSESPGHPTSEKPPSTSSSAGCMLCS 

LfflSRGFQLRRKRQLNGKCCPIQ 


2850 


A 


3 


409 


RQEGEDSAGSWHSQGPGQCQGRAKAGSG 
P**/GPATGLGLGQ" , QDQSQGKGQSSARPG 
*GQAFQGQGQGRTRARSEAGKGQGQDRS 
RAGP*HGQGLR*GKGRARAR*GSGPRPG* 
GQGKKYGRTRGNAKAKAGPGLT 


2851 


A 


174 


446 


MWLLPALLLLCLSGCLSLKGPGSVTGTAG 
DSLTVWCQYESMYKGYNKYWCRGQYDT 
SCESrVETTGEEKGGKEWPRVHQRPPGGSR 
LHCDH 


2852 


A 


1008 


1246 


INNLSWQDYGESP*ALSNQTS*WPILRPFIP 
VFLLLLFHLVFQFIQNRIQATTNHSI*QMFLL 
TTPOYHPLPODLPSA 


2853 


B 


428 


3792 


MSFDPNLLHNNGHNGYPNGTSAALRETGV 

IEKLLTSYGFIQCSERQARLFFHCSQYNGNL 

QDLKVGDDVEFEVSSDRRTGKPIAVKLVKI 

KQEBLPEERMNGQWCAVPHNLESKSPAA 

PGQSPTGSVCYERNGEVFYLTYTPEDVEG 

NVQLETGDKINFVIDNNKHTGAVSARNIM 

LLKKKQARCQGWCAMKEAFGFIERGDV 

VKEIFFHYSEFKGDLETLQPGDDVEFTIKD 

RNGKEVATDVRLLPQGTVIFEDISIEHFEGT 

VTKVIPKVPSKNQNDPLPGRIKVDFVIPKEL 

PFGDKDTKSKVTLLEGDHVRFNISTDRRDK 

I£RATNffiVLS>rrFQFTNEAREMGVIAAMR 

DGFGFIKCVDRDVRMFFHFSEILDGNQLHI 

ADEVEFTVWDMLSAQRNHAIRIKKLPKGT 

VSFHSHSDHRFLGTVEKEATFSNPKTTSPN 

KGKEKEAEDGIIAYDDCGVKLTIAFQAKD 

VEGSTSPQIGDKVEFSISDKQRPGQQVATC 

VRLLGRNSNSKRLLGYVATLKDNFGFIETA 

NHDKEIFFHYSEFSGDVDSLELGDMVEYSL 

SKGKGNKVSAEKVNKTHSVNGITEEADPTI 

YSGKVIRPLRSVDPTQTEYQGMIErVEEGD 

MKGEVYPFGIVGMANKGDCLQKGESVKF 

QLCVLGQNAOTMAYNITPLRRATVECVKD 

QFGFINYEVGDSKKLFFHVKEVQDGIELQA 

nFiTTArcpQVTT lsinp TnT^P^s A PNVAATR VPEGP 
yjLfc, V Ur o v i.L<rN v<rv *■ \J1v.v-'0/\v^in v tv jx v v^j^vjx 

KAVAAPRPDRLVNRLKNITLDDASAPRLM 
VLRQPRGPDNSMGFGAERKIRQAGVIDXN 
WRKQKCFVFIXINGLFTQRSKPQTTRGKIK 
PPSPTSPELTLVILDKAFSPLARDPVYGQFK 
KRAKKSDPSIPVI 


2854 


A 


1 


747 


MRLQRPRQAPAGGRRAPRGGRGSPYRPDP 

GRGARRLRRFQKGGEGAPRADPPWAPLGT 

MALLALLLWALPRVWTDANLTARQRDP 

imQfYR TTlFnTYNTR VWOH VPFRENTFECONP 

RRCKWTEPYCVIAAVKIFPRFFMVAKQCS 

AGCAAMERPKPEEKRFLLEEPMPFFYLKC 

CKIRYCNUGGAyNLSTHQNCSKNMLGAWV 

RAWGCGWPSSCCWPPLQPASACLEPRDC 

HRLSLPEHGLAPDRCHLLH 


2855 


A 


3 


1018 


FASFPSINLQQMLKEVPKRFGDERGAIVHY 

TLLNNHVYRRSLGKYTDFKMFSDEILLSLT 

RKVLLPDLEFYVNLGDWPLEHRKVNGTPS 

PIPnSWCGSLDSRDWLPTYDITHSMLEAM 

RGVTMDLI^IQGNTGPSWINKTERAFFRGR 

DSREERLQLVQLSKENPQLLDA/WNYRIFL 

TTD'DP'D Vn A\*1^ AlfT TJIdl T TYTrT^RNVDOTV 

J*JrJNJil\JS.vJA\ JS-f\JV-L»lVlVJ I-iLil J lv>i IVL^ V L/KJ X V 

AAYRYPYLMLGDSLVLKQDSPYYEHFYM 
ALEPWKHYVPIKRNLSDLLEKVKWAKEN 
DEEAKKIAKEGQLMARDLLQPHRLYCYYY 
QVLQKYAERQSSKPEVRDGMELVPQPEDS 
TAICQCHRKKPSREEL 


2856 


A 


3 


3707 


RAGEWPGWLLAAAAAHPGRPAASLSPGL 

GAVLGVAGRQVADPRFRRDWFRIPSPPAE 

SAGPARQAGFAAAPPARAGPALSTMKGTR 

AIGSVPERSPAGVDLSLTGLPPPVSRRPGSA 

ATTKPIVRSVSVVTGSEQKRKVLEATGPGG 

SQAJNNLRRSNSTTQVSQPRSGSPRPTEPTD 

FLMLFEGSPSGKKRPASLSTAPSEKGATWN 

VLDDQPRGFTLPSNARSSSALDSPAGPRRK 

ECTVALAPNFTANNRSNKGAVGNCVTTM 

VHNRYTPSERAPPLKSSNQTAPSLNNIIKAA 

TCEGSESSGFGKLPKNVSSATHSARNNTGG 

STGLPRRKEVTEEEAERFTHQVNQAAVTIQ 

RWYRHQVQRRGAGAARLEHLLQAKREEQ 

RQRSGEGTLIJDLHQQKEAARRKAREEKAR 

QARRAAIQELQQKRALRAQKASTAERGPP 

ENPRETRVPGMRQPAQELSPTPGGTAHQA 

LKANNAGGGLPAAGPGDRCLPTSDSSPEP 

QQPPEDRTQDWLAQDAAGDNLEMMAPSR 

GSAKSRGPLEELLHTLQLLEKEPDALPRPR 

THHRGRYAWASEVTTEDDASSLTADNLEK 

FGKLSAFPEPPEDGTLLSEAKLQSIMSFLDE 

MEKSGQDQLDSQQEGWVPEAGPGPLELGS 

EVSTSVMRLKLEVEEKJCQAMLLLQRALAQ 

QRDLTARRVKETEKALSRQLQRQKEA\YE 

ATIQRHLAFTDQLEEDKKVLSEKCEAWAE 

LKQEDQRCTERVAQAQAQHELEKKLKEL 

MSATEKARREKWISEKTKKIKEVTVRGLEP 

EIQKLLARHKQEVRRLKSLHEAELLQSDER 

ASQRCLRQAEELREQLEREKEALGQQERE 

RARQRFQQHLEQEQWALQQQRQRLYSEV 

AEERERLGQQAARQRAELEELRQQLEESSS 

ALTRALRAEFEKGREEQERRHQMELNTLK 

QQLELERQAWEAGRTRKEEAWLLNREQE 

LREEIRKGRDKEIELVIHRLEADMALAKEE 

SEKAAESRIKRLRDKYEAELSELEQSERKL 

QERCSELKGQLGEAEGENLRLQGLVRQKE 

RALEDAQAVNEQLSSERSNLAQVIRQEFED 

R\LAASEEETRQAKAELATLQARQQLELEE 

VHRRVKTALARKEEAVSSLRTQHKGSWK 

RADHLEELLKQHRRPTPSTKCPGMPGTLFK 

NGRQRTKAGRGPRGPQGRPPAPHRGWWL 

RCTRLSTCGCILTVKEAVVFSKKKKKGAPF 


2857 


A 


1 


2064 


MTASIRRYHTCATDGEPDSSVLVGGDGDL 

TLLVAALGLDLGLPFMLLPPLMEWMRVAI 

TYAEHRRSLTVDSGDIRQAARLLLP/GPEH 

CTSSFR\RLDARAATEKFNQDLGFRMLNCG 

RTDLINQAIEALGPDGVNTMDDQGMTPLM 

YACAAGDEAMVQMLIDAGANLDIQVPSNS 

PRHPSIHPDSRHWTSLTFAVLHGHISWQL 

LLDAGAHVEGSAVNGGEDSYAETPLQLAS 

AAGNYELVSLLLSRGADPLLSMLEAHGMG 

SSLHEDMNCFSHSAAHGHRGIWGLVTLGP 

LACLEEEDHETPSPRVPQSSPSGQEGTGGQ 

LRNVLRKLLTQPQQAKADVLSLEEILAEGV 

EESDASSQGSGSEGPVRLSRTRTKALQEAM 

YYSAEHGYVDITMEIJRALGVPWKLHrWIE 

SLRTSFSQSRYSWQSLLRDFSSKEEEYNE 

ELVTEGLQLMFDELKTSKNDSVIQQLATIFT 

HCYGSSPIPSIPEIRKTLPARLDPHFLNNKE 

MSDVTFLVEGKLFYAHKVLLVTASNRFKT 

LMTNKSEQDGDSSKTIEISDMKYHIFQMM 

MQYLYYGGTESMEIPTTDILELLSAASLFQ 

LDALQRHCEILCSQTLSMESAVNTYKYAKI 

HNAPELALFCEGFFLKHMKALLEQ\MPSGS 

SSTAAAAKCRAWmCRTCRTPWQSACTLS 

TSPPGSAA 


2858 


A 


1 


571 


FRPGRRAKRAMAVYVGMLRLGRLCAGSS 

GVLGARAALSRSWQEARLQGVRFLSSREV 

DRMVSTPIGGLSYVQGCTKKHLNSKTVGQ 

CLETTAQRVPEREALWLHEDVRLTFAQL 

KEEVDKAASGLLSIGLCKGDRLGMWGPNS 

YAWVLMQLATAQAGIILVSVNPAYQAME 

LEYVLKKVGCKALVFPKQ 


2859 


A 


2737 


2600 


MCCWIWFASILLRIFALMFIRDIGLKFSFFV 

VSLPGFGERMMLAS* 


2860 


A 


1 


1353 


MVKLSIVLTPQFLSHDQGQLTKELQQHVK 

SVTCPCEYLRKVSLLKTIFWSRNGHDGSTD 

VQQRAWRSNRRRQEGLRSICMHTBCKRVSS 

FRGNKIGLKDVITIJIRHVETKVRAKIRKRK 

VTTKINHHDKINGKRKTARKHTGDCHPGE 

WGQAHFVPDSPVHIALHGMAQPLFGIQG 

GALEPAGRGTGFLDSPVFRPIRKYNVQIPPS 

ARKALCNWSLLLVCVGKPEBFVAIHYYTPN 

TKLVPLARPRNSHVPHPPERTTVTQYSTCA 

LLTALCLLLPVLQETAQSRRMVTSHPEDSP 

AT APTfTJOASOPAOT GFTTOTOTVTPAJFTFOT 

PTAAEPALLSAWLGRAPETETITDMAGSA 

AAAPTCEMLRAHGHDDLYFKWEPCASSQ 

AITVLPKHSGTGGSRQGPAVAHPAAPFPKV 

RGGEGTYYLHLSVFSDLVDLHLLHVGORV 

VQGLRLRL 


2861 


A 


1553 


1896 


CSSFCFPFPRSRPTAPRPDHRPAEPQRLHSA 
EGAPEWGPTSDPHHHPCPGGAPGGTQDP 
KMAAEAPQQPNSDWAGEISMCRGSTHQL 
DMAF^FTFT5JATi?G5?5?RGRPAGKESC 


2862 


A 


262 


129 


SGLFIJETTPFPPFLPLPLCKHQIRDEWGNQI 
WICPGCNKPDDGSPMIGCDDCDDWYHWP 
CVGMTAPPEEMQWFCPKCANKKKDKKH 
KKRKHRAH*RDDYKMLFMTYKRKLRIFV 
RNALSLNT 


2863 


A 

A 


i 

0 




I VnPRVBiVBI OT T PI T T ^P AOGMPGAST T) 

GRPGDRVNLSCGGVSHPIRWVWAPSFPAC 
KGLSKGRRPILWASSSGTPTVPPLQPFVGR 
LRSLDSGIRRLELLLSAGDSGTFFCKGRHE 
DESRTVLHVLGDRTYCKAPGPTHGSVYPQ 
LLIPLLGAGLVLGLGALGLVWWLH 


2864 


A 


1 


553 


RTRGRTRGLVKKWASHHQINDASRGTLSS 

V<?T VT MVT T-TYT OTT PFPTT PSI OKTYPESFS 

PAIQLHLVHQAPCNVPPYLSKNESNLGDLL 

LGFLKYYATEFDWNSQMISVREAKAIPRPD 

GffiWRNKYICVEEPFDGTNTARAVHEKQK 

FDMTKDOFLKSWHRLKNKRDLNSIIJPVRA 

AVLKR 


2865 


A 


516 


848 


MWSLWIWVDQHQARLIPSPQVLLLLLRET 
PSTAAAVAGWLWASMALLQLHAVGGVA 
LTSSHPFMWATGEELRKPPWQGSAGSASG 
VEELTGKHSCPGPEEPATVQKAPA* 


2866 


A 


349 


1018 


TFTQPDPDDLISKPPRTPGGG*YQTQWPSPP 

DPRRTSPAGRPGPARRPPRRTPRPARGRHP 

GR*GGPGASRPGGTGAAPAADQTGSPAVS 

TPSEFGAPGQAEGPQSPERASARSHLSCTA 

WLGKPSKPSAQRQPTVGPDGDRDGSSQAP 

NLSRGQAWRASLASPQNTSATGRVTCHGQ 

STWPLCRIJCSNRRRKSGFA/GNKSEPVGLT 

RRSKHQPRNPQGQVGI 


2867 


A 

A 


117 




MYTVSLLLCLFFKKSDPDPGPFONM^HNH 

GTQSQSCMGSKVGDVIPGAARLISETAQRV 

HTIGQKQKNDQHLRRVQALLSGRQAKGLT 

SGRWFLRQGWLIVVPTHGEPRPRMFFLFT 

DVLLMAKPRPTLHLLRSGTFACKALYPMA 

Q 


2868 


A 


438 


2 


TQRLVISEPDGEILTPGWDTQDRMGVESRT 
NIQELGNRNQREAGGENLPETQAHMGETQ 
DQLRCODAETQTPEWENQDKNGSEDAVE 
TQTFEKKDKKEAGEEDGEEIQAQGLGKQG 
OTfrDFNGFFTOTRVLRALETIPASS 


2869 


B 


1 


390 


MTPKHDHLGHVLPISLQLLLELSSCLPAAS 

AVWCAGCNDPWMTGYPDNMHYNYKPML 

HDRGGSAVTLSASQSWYAGCNAEKSEVN 

AFPGTQGMRFISAASYKDWVQVLQQKDV 

SRNMGTKARSASSLKN 


2870 


A 


1 


3411 


MMEGEGGVRMSHDQTGNKRKHGTSGISV 

CPNLLLLQEYQPDYIRAHASGLNLISSSKAL 

PKYSHVLSGLCKICSFGPRFSLHSDTFFFAL 

FAHADPEQIRNCETPAPPLQTERKNEMRIK 

THPSSSPLYDTPGRPAGSDDSSSRGRAGAL 

STFLEPQPvPRTHLSLELHRPSPGPRLSLPLFT 

KPSFLGSGRREHAEERARGPRETAAVAAR 

AEQGRGGSHSHSSALGAPRRVAMLPGLAL 

LLLAAWTARALESLENRSAAGGCRKEMN 

KGNDNGALAIGGNMVnWVDDFGWYVDR 

DTLEQGSPTPSHGQVLVHGLLGTGPHSRST 

LNIKEQLPRSKISSIGACNIIFQVDINAIFGIL 

MWTDGNAGLLAEPQIAMFCGRLNMHMN 

VQNGKWDSDPSGTKTCIDTKEGILQYCQE 

VYPELQITNWEANQPVTIQNWCKRGRKQ 

CKTHPHFVIPYRCLVGEFVSDALLVPDKCK 

FLHQERMDVCETHLHWHTVAKETCSEKST 

NLHDYGMLLPCGDDKFRGVEFVCCPLAEES 

DNVDSADAEEDDSDVWWGGADTDYADG 

RTSAIFGYDHDCKVHDAFALSSVLVDRQE 

WGSTYESGAGQGIAAFWGACWKEEQSLL 

FLLPDMDWLCIJISMNFNYISQNSHMLWR 

DPGEEDSKKLSALSSLPGIVLALGKAQRILLI 

ELLGVGLESEDKWEVAEEEEVAEVEEEE 

ADDDEDDEDGDEVEEEAEEPYEEATERTT 

SIA'nTl'l'lTESVEEWREVCSEQAETGPCR 

AMISRWYFDVTEGKCAPFFYGGCGGNRN 

NFDTEEYCMAVCGSATNCTFDLKKSWSSG 

GQIQMADSIQRKGAELEAICQKRFSQRKHR 

YGKCFVGVLAPVMEEHFVIGTLGAASPFM 

NKLKA^CYFTPENRALAVPTTAASTPDA 

VDKYLETPGDENEHAHFQKAKERLEAKHR 

ERMSQVMREWEEAERQAKNLPKADKKAV 

IQHFQEKVESLEQEAANERQQLVETHMAR 

VEAMLNDRRRLALENYITALQAVPPRVGL 

AAAEFTLQVTAQTPRHVFNMLKKYVRAE 
QKDRQHTLKHFEHVRMVDPKKAAQIRSQ 
VMTHLRVIYERMNQSLSLLYNVPAVAEEI 
ODEVAFXINKNMNYYKPDAGKISG 


2871 


A 


18 


382 


GKMPPHLAMGCPPRLNPWEQPELGARGR 

GDGCPCPAEHGWALDVRYS*LPLPQSLASS 

LATPPQVFCSFTLSSKSPRPAARQETPAGAP 

PAGPSFAGRRRTIPGSGAPRRSPGGRROEO 

LR 


2872 


A 


673 


941 


CCLAAHSGPPAQGQRRGPG*LCCSAGSGG 
NL*S*AGGPG*GRSGQPVCPPWPGPGAPGH 
RPALPGSGGSSAVGRSAVPGAVRSPSHAG 
W 


2873 


A 


11 1 


/ 11 


AT T F<?T <?«?nFAOAWGAPRLVAGIRLIEHKC 

VLGGGTAGAWG*KDQVTIQPAGHAPGLSG 

TEATVTPDDSVSDPTTWPSQEVSMCHPLPG 

SHPSHLLKEGMTSVRPRALQQGPPWQLQT 

KDSAPPP*TPASFSPFFPLSPLPVSPSLSHTH 

SFRVOGAKRFA 


2874 


A 


1942 


932 


ARVRWRPPRWPPRASCPGPALRLCRGGSM 

GGPRGAGWVAAGLLLGAGACYCIYRLTR 

GRRRGDRELGJRSSKSAEDLTDGSYDDVL 

NAEQLQKLLYLLESTEDPVnERALITLGNN 

AAFSVNQAHRELGGIPIVANKINHSNQSIKE 

KALNALNNLSVNVENQDaKIYISQVCEDV 

TjQr;pT MCAVDT AOT TLLTNMTVTNDHOHM 

LHSYITDLFQVLLTGNGNTKVQVLKLLLNL 

SENPAMTEGLLRAQVDSSFLSLYDSHVAK 

EILLR\T.TLFQNIKNCLKIEGHLAVQPTFTE 

GSLFFLLHGEECAQKIRALVDHHDAEVKE 

KWTUPKI 


2875 




1 
1 




MARNRCVT3GOPGHLVDFTCLVTYRVSGES 

RAPOTMAEl^LWYH^ 

VEEKGPCICKJy^PNSWQRDAl^KJEMLQ 

QLQNM>TKQVLPSKASA1TI^ 

PDGSGGEKIDFLHTRTTPPPLL^ 

iNflCTVYRSOTTOT 


2876 


A 


1573 


2858 


EPWEQAJDQRSSTDTSLSTPAAPMVDSLIA 

RVGVMARGNAITLPVCGRD^ 

DSVEKTSRVWSGNERDQELLTEDALDDLEP 

SFLLTGQQTPAFGRRVSGVffilADGSRRRK 

AAALTESDYRVLVGELDDEQMAALSRLG 

NDYRPTSAYERGQRYASRLQNEFAGNISA 

LADAEMSQ*ICWKYTC 

RCESfTAKlPKSWALFSHPGELSARSGDAL 

QKAITOKEELLKQQAS^ 

PEEVITLLTSEIKTSSASRTSLSSRHQFAPGA 

TVLYKGDKNflTTV*^^ 

L1VILGTVTU)AVGIGLV1V^ 

SDSIASHYGVl^ALYALMQFLCAPVLGALS 

DRFGRRPVLLASLLGATTO 

IYPLVNSPSC 


2877 


B 


448 


3506 


XALMD3IDGGESWSFMDDNQNKTHDKKE 

KKMWQKPHGTMEYTAGNQDTLNSIALK 

F>nTPNKLVEL>nCLFTHTrVPGQVLFVPDA 

NSPSSTLRLSSSSPGATVSPSSSDAEYDKLP 

DADLARKALKPIERVLSSTSEEDEPGWKF 

LKMNCRYFTDGKGWGGVMT/TPNNIMF 

DPHKSDPLVIENGCEEYGLICPMEEWSIAL 

YNDISHMKIKDALPSPGEWEDLASEKDINP 

FSKFKSINKEKRQQNGEKIMTSDSRPIVPLE 

KSTGHTPTKPSGSSVSEKLKKLDSSRETSH 

GSPTVTKLSKEPSDTSSAFESTAKENFLGED 

DDFVDLEELSSQTGGGMHKKDTLKECLSL 

DPEERKKAESQINNSAVEMQVQSALAFLG 

TENDVELKGALDLETCEKQDIMPEVDKQS 

GSPESRVENTLNIHEDLDKVKLIEYYLTKN 

KEGPQVSENLQKTELSDGKSIEPGGIDITLS 

SSLSQAGDPITEGNKEPDKTWVKKGEPLPV 

KLNSSTEANVIKEALDSSLESTLDNSCQGA 

QMDNKSEVQLWLLKPJQVPIEDILPSKEEK 

SKTPPMFLCIKVGKPMRKSFATHTAAMVQ 

QYGKRRKQPEYWFAVPRERVDHLYTFFV 

QWSPDVYGKDAKEQGFWVEKEELNMID 

NFFSEPTTKSWEIITVEEAKRRKSTCSYYED 

PDFPVT PVT RPHSALLENMHIEOLARRLPC 

KGYPWRLAYSTLEHGTSLKTLYRKSASLD 

SPVLLVIKDMDNQIFGAYATHPFKPSDHYY 

GTGETFLYTFSPHFKVFKWSGENSYFINGD 

ISSLELGGGGGRFGLWLDADLYHGRSNSC 

STFNNDILSKKEDFIVQDLEVWAFD 


2878 


A 


226 


2263 


SVKOTTKCHVRIsfEQIOsTKLTSCKSCSLNL 

NCQWDQRQQECQALPAHLCGEGWSfflGD 

ACLRVNSSRENYDNAKLYCYNLSGNLASL 

TTSKEVEFVLDEIQKYTQQKVSPWVGLRKI 

NISYWGWEDMSPFTNTTLQWLPGEPNDSG 

FCAYLERAAVAGLKANPCTSMANGLVCE 

KPWSPNQNARPCKKPCSLRTSCSNCTSNG 

MECMWCSSTKRCVDSNAYnSFPYGQCLE 

WQTATCSPQNCSGLRTCGQCLEQPGCGW 

CNDPSNTGRGHCIEGSSRGPMKLIGMHHN 

EMVLDTNLCPKEKNYEWSFIQCPACQCNG 

HSTCINNNVCEQCKNLTTGKQCQDCMPGY 

YGDPTNGGQCTACTCSGHANICHLHTGKC 

FCTTKGIKGDQCQLCDSENRYVGNPLRGT 

CYYSLUDYQFTFSLLQEDDRHHTAINFIAN 

PEQSNKNIJDISINASNNFNLNrrWSVGSTA 

GTISGEETSrVSKNNIKEYRDSFSYEBCFNFR 

SNPNITFYVYVSNFSWPIKIQIAFSQHNTIM 

DLVQFFVTFFSCFLSLLLVAAVVWKIKQTC 

WASRRREQLLRERQQMASRPFASVDVALE 

VGAEQTEFLRGPLEGAPKPIAIEPCAGNRA 

AVLTVFLCLPRGSSGAPPPGQSGLAIASALI 
DISQQKASDSKDKTSGVRNRKHLSTRQGT 
CV 


2879 


A 


1 


1131 


MKVTFANKPEGGGRLAKQRPPGRGARPRP 

KHEGGQSVLGTRRPALLQVSCTDVSLSEQ 

DKDGATATHFAASRGHSKVLSWLLLHGG 

EISADLWGGTALYDAAENGELGCCQILW 

NGAELEVRDRDGYAAADLSDFNGHSHCT 

HCLRTVENLHRGMVLALGAAEHSKAQRP 

EAAGGPEGELPPEKESLEENEWPSRGQGLV 

PSAPTAVAQSMEHCVLSRDPSVELEAKQP 

DSGMSSPNTTVSVQPLNFDLSSPTSTLSNY 

DSCSSSHSSIKGQHPPRAPNPQILQYKKRFS 

ELEQLLERSGELEQQQLRDAEHSQDLESAL 

IWLEEEQQGGPGLAAWPPGRAPTDPLCPIQ 

ECQPGPGECHALRTAGPGRFGQPGSE 


2880 


A 


1 


416 


FRTDARVAITIYYQATEEFQNGIASYIPKDN 
SLQSETVQYKRGVCQQFCLPSHTVDPSEW 
AEEELGFDLDREVYPLWHAWDEGDEYF 
GHCHVLLGTFEKHTDGTFCVKPLKQKQW 
DGVSYLLOEIYGIENKYNTQ 


2881 


A 


419 


1 


KYFKCAPFPPATRPKAHTVFLKNVDIQVNL 

RFCSKVAKLHYPNNLLFHSLGITKMQLDR 

KELAWQSHSGSKGRDLFSPSLPALEQLRVP 

LEEHSASPDPIHPPSLAPERAASPGPPTGAE 

TRVPAPHAGTDPSEPPRR 


2882 


A 


2 


366 


ARPRWLJCRLGSQRELAQLGPEHLQAGHR 

PAPLRPAAGHAPDRVRAPQRRRASAHARG 

SGGLVGPGALPLAAPSRPPGAPLRGDQGL 

GQLPASQPQGLGAHAAAADPGLQPRAAG 

ATEFSV 


2883 


A 


3 


1396 


RQENNTRGVPSLLKSFLQERLGIHLIRRKIV 

KPKHHVLMSRKESWKVKSEIPKVPKQPLV 

LHHPRJvrrTTKSPSKDMLEPEAELAEDLPTT 

KSTSVES/EDAH*EPGRPFPVLPDL/PCHCLP 

SAPTPLCIVKRPCPT*VTQLSASAQSAHQM 

RTPRAQSPSS*PR*VNCLPPS/LHKDDLELK 

EKDQKKPPTAPREVKGTRRKLPTAFLPSKY 

HGYEELLTAKPDPAFEBPKGIQKNA/PSPAT 

NAEAPTPVPLLQAQAGHSSETLCSQRETGP 

ENPDSTPKED*SPTSG*HLHSLAGSPEHYRG 

STRCCPAPVDRTAAGEP/ASSTWRPRGC*R 

SSRHVTGSW*VALCAQCSGLPRSPWPAQR 

*VRASPSSATSSSSWMSSARSPQPVTHKAR 

AVHGGCVHHPACAPALPEGSVPWTAPQG* 

PAGHRPQSSAGPHLLATRWHPLVRISPPWP 

RHDLVPGPAAIKSGCTGQ 


2884 


A 


437 


748 


MLIGLLAWLQTVPAHGCQFLPITSVTATVY 
HLPVHQLKGRSRVQKNLTLDNEGEGTWTT 
CLEFLESLAGWRLGWGVSRGVREWLCLQ 
OVSLHQTPGLPHKODL* 


2885 


A 


1696 


2394 


ERSTYDLRSSDRPAQETSHQFQIHLPCVLLL 

YSPTLTLKYISTPSLATDHAPLTISLKPNHP 

YPAQCQYPEPQHALKGLKPAITRLLQHGLL 

KPINSPYNSPDLPVLEPEKIYRLVQDLRLINQ 

IVLPIHPWPNPYTLLSSIPPSTIHYSVLDLK 

RAFFTffLYPSSQPLFAFTWTDPDTLQAQQI 

TWAVLPQSFTDSPHYFSQAQISSLSVTYLSI 

ILIKTHTLSLLIMSD 


2886 


A 


377 


3 


TPAWMTERDCIWRPvRTSAPGGSWPSGPVP 

SPGAQ*RPPSQGLGLWWAAAAAPRC*TAP 

GPRPPPHGPGSPQGASPPTRPPRCRPHPRA 

GSAGPTGATPPGSTQGQRRRHSHQLPGHP 

GHRVALG 


2887 


A 


1162 


536 


HILRRQEFFFFCLFVCLRWVLVLLPRLE*CG 

MHAHCNLFLLGSSNSPASAS*VAGTTGVR 

HHAWUFCILVETEFHRVAQTDLELLSSGNP 

PASAS*SAGIIGVSHSAWPESCRYARRKCF 

CVKKLRRWKLNPLCIQKAVSEGHCWQASP 

YRDSAVREQSIWGTTASSGGARMRWSSPA 

ALYVRLLAGFSFINKLVASEYRVFSSTL 


2888 


A 


128 


2626 


NSHRWVYVRARRWRRRGKQREQPEDRGV 

PMKRAAMALHSPQY1FGDFSPDEFNQFFVT 

PRSSVELPPYSGTVLCGTQAVDKLPDGQEY 

QPJEFGVDEVIEPSDTLPRTPSYSISSTLNPQ 

APEFELGCTASKITPDGITKEASYGSIDCQYP 

GSALALDGSSNVEAEVLENDGVSGGLGQR 

ERKKKKKRPPGYYSYLKDGGDDSISTEAL 

VNGHANSAVPNSVSAEDAEFMGDMPPSVT 

PRTCNSPQNSTDSVSDIVPDSPFPGALGSDT 

RTAGQPEGGPGADFGQSCFPAEAGRDTLS 

RTAGAQPCVGTDTTENLGVANGQILESSG 

EGTATNGVELHTTESIDLDPTKPESASPPAD 

GTGSASGTLPVSQPKSWASLFHDSKPSSSS 

PVAYVETKYSPPAISPLVSEKQVEVKEGLV 

PVSEDPVAIKIAELLENVTLJHKPVSLQPRG 

LlhnCGNWCYlNATLQALVACPPMYHLMKF 

ffLYSKVQRPCTSTPMnDSFVRIMNEFTNM 

PVPPKPRQALGDKTVRDIRPGAAFEPTYIYR 

LLTVNKSSLSEKGRQEDAEEYLGFILNGLH 

EEMO^IJKKLLSPSNEKLTISNGPKNHSVNE 

EEQEEQGEGSEDEWEQVGPRNKTSVTRQA 

DFVQTPITGIFGGHIRSWYQQSSKESATLQ 

PFFTLQLDIQSDKIRTVQDALESLVARESVQ 

GYTTKTKQEVEISRRVTLEKLPPVLVLHLK 

RFVYEKTGGCQKLKNIEYPVDLEISKELLS 

PGVKNKNFKCHRTYRLFAVVYHHGNSAT 

GGHYTTDVFQIGLNGWLRIDDQTVKVINQ 

YQWKPTAERTAYLLYYRRVDLL 


2889 


A 


1669 


1338 


FRRPRRANRFRSRIRNQPGPHGETPFFL*IP 
KLARHGGG/CP*SPLLRRVRPENPFNPGSRG 
1 FN*LKPQPCPPTWVTE*DSVSKTNKOPPPT 










V XTKTD fTB W A TV A/T? Q O \AV TW ^ 
JsJSJN Jxl/vJJv W OA! W HoV^lVUD 1 W O 


2890 


A 


807 


369 


GKGGGGQTRRCARPGRHHAAPALRADRT 

GPAPRRGLFGRCRTLQPSARRLSSEHSV*Q 

THGCATPSRCHGGDGREDRGSPGDRGERP 

AGPAGGAGLEPAPGTLQPRSRPSRRWLLSP 

GAGAQQLEWHLPGQRPONOPCPLDFLP 


2891 


A 


1204 


2 


FPFPVPPPLFTDPRAPQPHRHLAFRGHRKE 

KGPGDPPSTPQSQ\ADPAAAPQGQPGC/RLP 

RGHCDRRHQEARPGCWGPP\GGPGSILGPK 

SWCHLEADSGKRPGWTVGVGVRSSPACP 

GH/VEQQGSAGSPGWMGWGCPCPVS*PLQ 

GQNQPSPSSLGGSRGSFFSPPDPA/GGQGQE 

GEGRGERSGQGPWGPGSFKNA/RQVAGGG 

QEGGQGPDPHDGGSLRPPRMKEGGLGRRG 

T>nrvDc\/ r rD\/T a a u wqv a ppQnrjnnwPT 

RrQJroV 1 JrVLuoAAKW &]SJ\jrroK^KJ\^urus. L 
GGNRHIJ^P*SSGGRGGAPGALGL/PWHPA 
CSGASGHSGRWA*RSSGWG*GPSPHTPPPG 

AGVKISLLLGGERGL/PGPLAWHDSGDGG 
AGHRGGV*S*RS\PPDPLSLSPRPAA 


2892 


B 


74 


325 


SAFSYIPPRRLDPTEHSYYYRPAREQERPA 

GVLTSSVYGKRINQPffiPLNRDFGRANHVQ 

ADFYRKNDIPSLKEPGFGHIAPS 


2893 


A 


1 


3426 


MAGGQEVEAWADQLCAKYSKEYGKLCR 

TNQIGTVNDRLMHKLSVEAPPKILVERYLI 

EIAKNYNVPYEPDSVVMVEDILEMSLVEFG 

NIGEAFLEQNQSPESSVTLTSANATLLLSRQ 

NISTLPLSSYTLGHPAPVRLGFPSALALKEL 

LNKHPGVNVQVFALDPVLGTFULTSVILM 

VLWINLFVSAILMAFGKERKSLKWMQS 

NTICYRENRISTVPPSGTRETARKAKGHRG 

LPENPVQLSEAFNCQDKLCNWIPVGQCPA 

ARSTVYANERAQLPGTVTMASRVIFPLPLA 

FESLHTPGKSSSQGSDAGAGPPILGLFCPW 

TRGPRI5AIJIARRLSSPIADVNKNIPPSKHR 

TDJ3SRPDGSIIJFLPPFFWTITPPARADVQE 

KDGHTffiQDEGERQHQIEKTEEENTNKPKR 

KQKLAPGTPQSNMKPVHERSQECLPPKKR 

DLPVTSEDMGRTTSCSTNHTPSSDASEWSR 

GVWAGQSQAGARVSLGGDGAEATTGLTV 

DQYGMLYKVAVPPATFSPTGLPSWNMSP 

LPPTFNVASSLIQHPGIHYPPLHYAQLPSTS 

LQFIGSPYSLPYAVPPNFLPSPLLSPSANLAT 

SHLPHFVPYASLLAEGATPPPQAPSPAHSF 

NKAPSATSPSGQLPHHSSTQPLDLAPGRMP 

IYYQMSRLPAGYTLHETPPAGASPVLTPQE 

SQSALEAAAANGGQRPRERNLVRRESEAL 

DSPNSKGEGQGLVPWECWDGQLFSGSQ 

TPRVEVAAPAHRGTPDTDLEVQRWASQV 

GPQSTHJITQCLCNHLTFFASDFFVVPRTV 

NVEDTKLFLRVTNNPVGVSLLASLLGFYV 










ITWWARKKDQADMQKGCQTPAGVHPPA 

PQLEEAGTIPSGQLVKVTVLADNDPSAQFH 

YLIQVYTGYRJRSAATTAKLSVYLILPGCRT 

RTRDPLSGVGSRPVAGAEYRLPGQFGRTST 

VAASNTQAEGAAGHRGFWLAKQHPKDAV 

TLELRCTPCRSIARLSDAGGVPAGAPvRVRC 

AAVLANCSLDMKRGVCASRSATVRKRSD 

KDVEELGDRESAVGVSDFLDGDAHYERN 

GNNSHLYQRHKKTKRGVAIARDKMPPDF 

QDHVffGQEIKAKSFYSPVDSDETGDKIRY 

NSKRRHWRTGMLGL 


2894 


A 


3 


30 


ENFQHFMDRISNGGLEEGKPVDLVLSCVD 

NFEARMTINTACNELGQTWMESGVSENAV 

SGfflQLIIPGESACFACAPPLWAANIDEKT 

LKREGVCAASLPTTMGWAGILVQNVLKF 

LLNFGTVSFYLGYNAMQDFFPTMSMKPNP 

QCDDRNCRKQQEEYKKKVAALPKQEVIQE 

EEEHHEDNEWGIELVSEVSEEELKNFSGPV 

PDLPEGITVAYTIPKKQEDSVTELTVEDSGE 

SLEDLMAKMKNM*ISWIE 


2895 


A 


1 


2369 


AGGARLRPARGRPPRLLPPRPGPCRPPPVP 

APTVNERRAPPRAGWERRSDAGLSRGARP 

AEMYGVCGCYGALRPRYKRLVDNIFPEDP 

EDGLVKTNMEKLTFYALSAPEKLDRIGAY 

LSERLIRDVGRHRYGYVCIAMEALDQLLM 

ACHCQSINLFVESFLKMVAKLLESEKPNLQ 

ILGTNSFVKFANIEEDTPSYHRSYDFFVSRF 

SEMCHSSHDDLEDCTKIRMSGIKGLQGVVR 

KTVNDELQANIWDPQHMDKIVPSLLFNLQ 

HVEEAESRSPSPLQAPEKEKESPAELAERC 

LRELLGRAAFGNKNAIKPVLIHLDNHSLW 

EPKVFAIRCFKIIMYSIQPQHSHLVIQQLLG 

HLDANSRSAATVRAGIVEVLSEAAVIAATG 

SVGPTVLEMFNNTLLRQLRLSIDYALTGSY 

DGAVSLGTKIIKEHEERMFQEAVIKTVGSF 

ASTLPTYQRSEVILFIMSKVPRPSLHQAVDT 

GRTGENRNRLTQIMLLKSLLQVSTGFQCN 

NMMSALPSNFLDRLLSTALMEDAEIRLFVL 

EILISFTORHGNRHKFSTISTLSDISVLKLKV 

DKCSRQDTVFMKKHSQQLYRHIYLSCKEE 

TNVQKHYEALYGLLALISIELANEEVYVDL 

BRLVLAVQDVAQVNEENLPVYNRCALYAL 

GAAYLNLISQLTTVPAFCQHIHEVIETRKKE 

APYMLPEDVFVERPRLSQNLDGWIELLFR 

QSKISEVLGGSGYNSDRLCLPYEPQLTDED 

RLSKRRSIGETISLQVEVESRNSPEKEEVSV 

RATVLGQPHLL 


2896 


A 


1575 


1968 


REMGFRHVGQTGLELLTSGDLPTSASQSA 
GITGVSHHTWPKTLFVLRQSLTLSPGLECS 
GTISAHCSPHLPCSSNSCAPASRVAESTEAH 
H/LCPDNLfflSSREGASPCWPGCS*TPELKR 










PAHPCRDQLOH 


2897 


A 


524 


954 


FCSMSSQKWSWQAQPLSWRHWSQGPVPS 
LPAKLLFKGFLPGTAKPACSAFREAAALAF 
IQDNKTAISEEKGNGSRFLGFPSARLRGRPR 
AESPRPEPPvARPRATQPGPAAPAAHATPPP 
GPAPAPYLVIRGASGGRGNVRGPK 


2898 


A 


188 


590 


DLHFEIQVLLEALRGLCSLYPKHREGSLKV 

HPGHLCWMPTVTRPGTPPSQASTGAQELP 

GGEKKTCRWEKKKKTFPGSAGLTGKSIER 

LTRPALYLRPLLFSSFPVRVTLEALPGGVPK 

RSASRMPVEMKRGPF 


2899 


A 


41 


274 


KRGTERKTHFGGCSIQFSDIASGKNILPGLC 

FLTHKR\WFCSL*RQGWVSRWSHE*GCTR 

CWRLGKFLWVADRFLGSG 


2900 


A 


1 


1462 


MKAMPWNWTCLLSHLLMVGMGSSTLLTR 

QPAPLSQKQPJ3FVTFRGEPAEGFNHLVVDE 

RTGHTYLGAVNRIYKLSSDLKVLVTHETGP 

DEDNPKCYPPRWQTO^PLTTTNNWKM 

LLIDYKENRLIACGSLYQGICKLLRLEDLFK 

LGEPYHKKEHYLSGVNESGSVFGVTVSYSN 

LDDKLFIATAVDGKPEYFPTISSRKLTKNSE 

ADGMFAYVFHDEFVASMDCIPSDTFTIIPDF 

DnfYVYGFSSGNFVYFLTLQPEMVSPPGST 

TKEQVYTSKLVRLCKEDTAFNSYVEVPIGC 

ERSGVEYRLLQAAYLSKAGAVLGRTLGVH 

PDDDLLFTVFSKGOKRKMKSLDESALCIFI 

LKQINDRIKERLQSCYRGEGTLDLAWLKV 

KDIPCSSAIRVDGPRGNALQYETVQWDPG 

PVLRDMAFSJCDHEQLYIMSERQSQELCPPQ 

ELDDIFSCCQTPRSPDFSHTGTHCALDEAA 

MAWEWSHSQ 


2901 


A 


14 


348 


GLFPNKBPFSVLEIRTWAHLSGRHHSAHCT 
SCAWPQVACLPLATHPSCTCTFCSLQAPGR 
PGQSPLSPRRACGPEDLPPPPYV*DLAPSLG 
PSLGPLMSQSQPRRTPPLRG 


2902 


A 


191 


1375 


EWPEGGGRYSSVPSAVHHARTCLAAELSG 

TSRPQEPRALPPETGVATAEAEKSNQPAAI 

SKNPNGQGAPLQR/RSPRLSPSPGAAQVPAL 

PMQDMSEGSSSPSPPGGHIWLASLTPCSLA 

LWNSCCQSPGSQPRGRDEGDCLVRATEPS 

ATGPDPRRTRLCSISASLWRNTPDPGISDR 

RPGISDRRPGTSDRRPGTSDRRPGISDRRPG 

TSDRRPGTSDRRPGTSDRRPGTSDRRPGISD 

RRPGTSDRRPGISDRRPGTSDRRPGISDRRP 

GTSDRRPGTSDRRPGISRLPRDWIPAAAAS 

RENSNSADARNRCSSPSRKCQTPTSHRMR 

GSAGSVGSSAGHTAGGTGLPTPSRCSQAL 

QVFPAVLGKRGFLSWERSLKQRDIRGPDFS 

STALI 


2903 


A 


1 


2547 


MRKYNSLWDMRKVSWWIDQASHNIPLS 
QSQIQERPFNSVKAERGEEATEEELEANTAS 










GCASRLHSYLLAIJICFTVKLCGVPSPHLFA 

SSTASLPESPGCCMHSLVTKSPCGDPLEPD 

DATLFKQNIJFYLETLNTKQKLYHKiaFRTA 

MLFQFVKVHLLQVLVIIKSHDLLQEEIGIAIY 

NMASVDFDGFFAAFLPEFLTSCDGVDANQ 

KSVLGRNFKMDRVCEGISSLSRLQNELSYI 

EKFTDFLRLFVSVHLRRIESYSQFPWEFLT 

LLFKYTFHQDLDIQPSQAVFGGIEFTYILVT 

LVDLGTQRVPKPGCGQGGRANCPNSGANA 

TANGTAAPAAAAAAATAYGERPTWRRAD 

TAGRPATNASASGFPHRIELKAGKTITLED 

GRQINGADYLAAPVPGKALAIFGDTGPCD 

AALDLAKGVDVMVHEATLDITMEAKANS 

RGHSSTRQAATIAREAGVGKLHTHVSSRY 

DDKGCQHLLRECRDFKATRPNEKWVTDV 

TEFAVNGRKLYLSPVIDLFNNEVISYSLSER 

PVMNMVENMLDQAFKKLNPHEHPVLHSD 

QGWQYRMRRYQNILKEHGIKQSMSRKGN 

CLDNAWECFFGTLKSECFYLDEFSNISEL 

KDAVTEYIEYYNSRRISLKLKALAVALANI 

DPHELTSCADACKRTALVANPWQLGNVR 

DARTYKELLDQIAELLRELGSADRLMEVIR 

EELELVREQFGDKRRTEITANSADINLEDLI 

TQEDVWTLSHQGYVKYQPLSEYEAQRRG 

GKGKSAARKEEDFIDRLLVANTHDHILCF 

SSRGRVYSMKVYQLPEATRGARGRPIVNL 

LPLEQDERITAILPVTELGIL 


2904 


A 


165 


638 


MFVIAFI^PLSLIFLAKTLKKADTRDSRQAC 

IAASIALALNGVFTNTIKLIVGRPRPDFFY 

RCFPDGLAHSDOvlCTGDKDVVNEGRKSFP 

SGHSSFAFAGLAFASFYLAGKLHCFTPQGR 

GKSWRFCAFLSPLLFAAVIALSRTCDYKHH 

WOGPFKW* 


2905 


A 


1 


2301 


MGWDCGLARWARVGLRERAAVQPLAPG 

CAAMSFAFPPFIPQGYKTAFGVGTNKIVTQ 

DNRWELPGAWYFPRASSQAREMPQCPTLE 

SQEGENSEEKGDSSKEDPKETVALAFVREN 

PGAQNGLQNAQQQGKKKRKKKRLGLKAG 

EWGAMLMIGDQSIQLPAFLSSIVRRAAQQ 

YGFREGGEDDDWTLYWTDYSVSLERVME 

MKSYQKINHFPGMSEICRKDLLARNMSRM 

LKMFPKDFRFFPRTWCLPADWGDLQTYSR 

SRKNKTYICJ^DSGCQGKGBFITRTVKEIKP 

GEDMICQLYISKPFIIDGFKFDLRIYVLVTSC 

DPLRIFVYNEGLARFATTSYSRPCTDNLDDI 

CMHLTNYSINKHSSNFSRDAHSGSKRKLST 

FSAYLEDHSYNVEQIWRDDBDVIIKTLISAH 

PnRHhmCTCTPNHTLNSACTEILGFDILLDH 

KIJ^WLLEVlSnHSPSFSTDSRI^KEVKB 

LYDTLVIJNI^SCDKKKVLEEERQRGQFLQ 

QCCSREMRBEEAKGFRAVOLKKTETYEKE 










NCGGFRLIYPSLNSEKYEKFFQDNNSLFQN 

TVASRAREEYARQLIQELRLKREKKPFQM 

KKKVEMQGESAGEQVRKKGMRGWQQKQ 

QQKDKAATQASKQYIQPLTLVSYTPDLLLS 

VRGERKNETDSSLNQEAPTEEASSVFPKLT 

SAKPFSSLPDLRNINLSSSKLEPSKPNFSIKE 

AKSASAVNVFTGTWSILEAEKSKIKVLAS 

LMSGEGLFLIDGSFLLCPHTVEGAS 


2906 


B 


1 


1518 


MVNTERQLDWIERCQVLILALSEEINPELPE 

AIVMASSEWTRQDNIDSPQEPPPTPLFASR 

PVTRLKSWRAPRVRPVGPRTHPVVISPVPE 

CnSIDELRSWQNPHIGTLTGRVRAVMVRKA 

KWKPLELSLPRKTVNQKQYCVPGGIVEISA 

TTKDLKDAKVVIPTISLFNYPrWLVQKNDG 

SWRMAVDYHKLTQGVTPIAAAVPNVISLL 

EQINTSSGTWYAAIYLVNVFFSIPVHKALK 

KQFAFSWQGQPYTFTILPWGHINSPTLCYN 

UWPJSLDHFSLPQDITLVHYIDDIMUGSSE 

QEVANTLDLLEKALQQVQAAVQAALPLGP 

YDPADPWLEVSVADRDTVWSLCSCCYTP 

WFGTLSHVSNLQTWSPCPPPVSPVGSQRPQ 

LSREKNKNTKRIHSIPEVLIMKPYFTAVAKP 

SLLSHKWLPLEKPENPCCYSSDHRTAVPNL 

LLYRRSTRRKTELTNKELTSAHFTGDLPRR 

AVWVLGDRTAVRPSLEOGMALWI 


2907 


A 


2 


266 


KGSTEAFISGTAGWGTGLLPSSAGLPGGW 

GPAGGWAGTDRRGPRARPIPQKSPPWPWS 

GDAAKGQSGFLPVAAWAGQGRLPGGGirV 

H 


2908 


B 


494 


641 


MADLEQLGLNPGLEGTHHLHHPGHMGAK 

LDKQHPHDRVPTRKSDPACGMGTAVAHH 

IAPGWLRAAVTQTPFKFCQWKLCSCVNIA 

GDSFSPWYGGISVAHPEPTVTASPTTQGSA 

LPPGEENPSEWLCAFSKREAQYEHSLRPL 

KEDRTVYRVGPNKRGKRRTVLKHMQWKL 

IKGAYRRGQLLANNQAEHKVVSRKINQDC 

FTLEGGTAWKQHALSESSRHALAQFFTVMH 

LPAQPGALRAPLLLTLAALVHVGVQSRGS 

RSRFLGCLEPIERSFLGVLPRSWERSVLCLP 

VNSLQGACLRLPAAADSSEFKRS 


2909 


A 


149 


300 


TRRGGCPEEKVEELKLWEKCVHSLYRHSS 
SALDLQOPGAIY1PSGFPLR 


2910 


B 


312 


466 


MGQVWVLVHSTLEPFHTNNEEEAKYNEV 

TEEVTEQVCLPAKANAAKEKEVHPYPSAP 

LNYFEEKEWPDPPDLSFLEDTGGDPSLTSH 

WQLTKEAEAELQLIEKQVHKAQINRIDPEK 

IPDLLIFSTQHSPTGVIVQEQDLVEWFFLPH 

TDSWTLTPYDDQITTMIGIGRTOIVKLHGY 

DPGKIIVPLMBCAQIQQAFINSLTWQTHLAD 

FVGILDNHFPKMKIFQFUCLTNaLPKITKF 

KPIEGAENVFTDGSSNGKASYFGSKRKVFQ 










TPYTSAQKVELVAVIELLTAFDMPINVISDS 

SYVVHSTQLIENAQLRFHTEEKLMTLFTQL 

QTAVRSRMHRFYITHIRAHTHLPGSLTEGN 

QMADRLVATAVSNARHFHSLTHVNASGL 

KHRYSITWKEAKAIIQRCPTCQVVHSSSFT 

VJVJ V lir Kt Tl JjliluJ t YV XilYXX^ V X XI V X OX vJX\JL*^k A 

VHACVDTFSHFVWATCQSGESSAYVKRHL 
LQCFWIGILASIKTDNAPGYTSQALATFFS 
IK^^KHITGIPYNSQGQAIVERMNLSPETAV 
AKSKKKGGKQGLRGHPICN 


2911 


A 


3 


415 


PTrjciTD Qon^VQ^PPVOPT? filCR AMYHSAA 
ELVSRGFPRPPVQAPAEPAGAAEGVHSQPA 
SRQEA/GS/TEVRGQAHRFVSPPNAAGAGD 
fr/Pr>POST T APTNRPCPPGGISPARSEPVPPA 
PGRAAP*CFPDLPGLAPPLC 


2912 


A 


178 


423 


MLLTPYFLEWKKLWPLAVLSLAWLTYDW 

OTHSQGGRRSAWVRNWTLWKYFRNYFPV 

KLVKTHDLSPKHNYIIANHPHGBLSF 


2913 


A 


52 


228 


MLTLPQSLWMLTRRTICFVPTTVSCRGLLPS 
NPHHELARLISVSOHRVWPHPVGTQYL* 


2914 


A 


447 


1331 


SHPLLSCPEKVSAKLRAAAEAAAEERRTR 

GAGSRGICAGLRSVAPGPEPLKQEEGRRE 

WGSSIGTPSPCGSAQAAAAAAAEEATEKIP 

ALRPALLWALLALWLCCATPAHALQCRD 

GYEPCVNEGMCVTYHNGTGYCKCPEGFL 

GEYCQHRDPCEKNRCQNGGTCVAQAMLG 

vattdpa cr;PTrjpr>prkV5TQHPPFV^RPPT 
JS-A X CKvAour i uw^y ioi onrur v oivr 

NGGTCl^LSl^TYECTCQVGFTGRNPKCP 

GGNLNYQl^GIlWx 7 SGGSVPPSGTKTSKP 

AFTWAMGTGSKNFASGTLWVMVSGATST 

STSTL 


2915 


A 


160 


409 


DSPTSVIWSSSTGKYSPHPSAGRWRGYCP 
RRVLCCPSPEAALEPGRARAQGIRGDSPW 
HGPTCTOPGRKTVIVGIQLPTQAI 


2916 


A 


1578 


685 


WLQQGIAQRTHLIGRTYQSWLAIMPGCNH 

SMTQLHMI^GLRIYHNKSAPVIEVYCPQKP 

ICKQNWTWIJEIMNVFVWEDCIAKQAEVLC 

NNSYGIIIDWSPKGMFSLNCTCQSVCHSHT 

MFSWSEQNSQMVEMVRNTARW1TWKRG 

GlYAPQPQNflWSTVEAKHKDLWKLLMSV 

Mlf TK'TWFRTKTCHT FOTTSTNI FLDMAKLKEO 

IFKASQAHLTLMPGTGVLKGAADKLAASN 

PLKWMKTLGSSVISMMIVLLICVVCLCVV 

CRCRS*LLREVAHRDKAAFAFIALQKQEG 

(TV A OP 


2917 


A 


no 




KWKKYPLGFQTFSNNSQWDTSEFLCSSLL 
YVLGVSSQNAVNQYSffiRSlYGGDCCPFFP 
WYVlfflSWATOCEQRLFLAQQQQEDHEDC 
TKFEVPH 1 


2918 


A 


2 


335 


EDx^AFRPRQPOTLHPLHARSlJ^PRSP 
PPSPDTQLGLSGPTSGPESAPTA/PGNPSWR 










SSRWGSSSPCAASST*KSPYP*/CSPT/CAFP 
SPRLPFCRSAYQPAAGAGRGK 


2919 


A 


486 


248 


VRQLFSLLLPRLECNGVISAHCNLRLPGSC 

DSSASAS*VAPJTGASGSQAWLQVQCLQP 

VQPGELLRVDLFQLVVLQR 


2920 


A 


3 


535 


AARQQHCTQVRSRRLMKELQDIARLSDRFI 

SVELVDESLFDWNVKLHQVDKDSVLWQD 

MKETNTEFILLNLTFPDNFPFSPPFMRVLSP 

RLENGYVLDGGAICMELLTPRGWSSAYTV 

EAVMRQFAASLVKGQGRICRKAGKSKKSF 

SRKEAEATFKSLWKTHEKYGWGHPARVP 

DG 


2921 


A 


3384 


1260 


AGQTPGHRASGPSERSPAPRSRLQPGGEAA 

TRTEPATPGRRAGPGSATMEALMARG\AL 

TGPLRALCLLGCLLSHAAAAPSPIIKFPGDV 

APKTDKELAVQYLNTFYGCPKESCNLFVL 

KDTLKKMQKFFGLPQTGDLDQNTIETMRK 

PRCGNPDVANYNFFPPJCTKWDKNQITYRn 

GYTPDLDPETVDDAFARAFQVWSDVTPLR 

FSRIHDGEADIMINFGRWEHGDGYPFDGK 

DGLLAHAFAPGTGVGGDSHFDDDELWTL 

GEGQVVRVKYGNADGEYCKFPFLFNGKE 

YNSCTDTGRSDGFLWCSTTYNFEKDGKYG 

FCPHEALFTMGGNAEGQPCKFPFRFQGTSY 

DSCTTEGRTDGYRWCGTTEDYDRDKKYG 

FCPETAMSTVGGNSEGAPCVFPFTFLGNKY 

ESCTSAGRSDGKMWCATTANYDDDRKW 

GFCPDQGYSLFLVAAHEFGHAMGLEHSQD 

PGALMAPIYTYTKNFRLSQDDIKGIQELYG 

ASPDIDLGTGPTPTLGPVTPEICKQDrVFDGI 

AQIRGEIFFFKDRFIWRTVTPRDKPMGPLL 

VATFWPELPEKIDAVYEAPQEEKAVFFAG 

NEYWIYSASTLERGYPKPLTSLGLPPDVQR 

VDAAFNWSKNKKTYIFAGDKFWRYNEVK 

KKMDPGFPKLIADAWNAIPDNLDAVVDLQ 

GGGHSYFFKGAYYLKLENQSLKSVKFGSI 

KSDWLGC 


2922 


A 




575 


RRAQGEPERRAPSLAWTCRDPIPTREELAL 

TSTTTSCISSLSIVPFQTILVGDSGVGKTSLL 

VQFDQGKFIPGSFSATVGIGFTNKVGTVDG 

VREKLPRWTPAGKERFRSVTHAYYRDAHG 

*FLLYDPNHPJSLLRLSAL 


2923 






207 


MWHLSV 


2924 


A 


3 


453 


VRSDMNSNPLVDGRYRAPPAPRAPAEAGAS 

SQP*SPPAAQASGKEGGENNAPLFQ*TPLPT 

TPTDTLSVPNPRAPVPPSDRFLRSRPPGPRPS 

FPFRLQGGGGAPH*RGSSATPTPPA/SAPGP 

GVRSLPRPRWWTPIRLKKPWOKSADPSLQ 


2925 


A 


711 


4 


GARFACLCSTTPAPMASCLGLLILSSCLLA 
DCRFIPEAWSACTVTCGVGTQVRIVRCQV 
LLSFSQSVADLPIDECEGPKPASQRACYAG 










PCSGEffEFNPDETDGLFGGLQDFDELYDW 

EYEGFTKCSESCGGGVQEAWSCLNKQTR 

EPAEENLCVTSRRPPQLLKSCNLDPCPASSL 

WEPKCVGKGHQLFYLTTVLSSRKKQYRL 

SMERLQRSLLGNQEAWLLILLSPTSSVA 


2926 


A 


2126 


2241 


RQGFHHVGQAGLKLLTSGDLPALASQSAG 
IAGMTHSAR 


2927 


A 


830 


1143 


NDQSALVPvARSSFSKSVKPRTHQFFHMFNI 
GPARDGPPPPSPAPHGPGTLPYRGSSRPGSP 
PPPPRTPPVSSFLCHSSGAPVTRRDAAAQA 
HLLCSRFPFSFIG 


2928 


A 


1 


782 


MTKIQEPSTSVKFLGVQWSGAYQDIPSKV 

BCDKLLHLAPPTTTKEAYLGL/FGFWRQHIP 

H/LGTEQEKTLQHVQAAVQVALFLEPYDP 

ADPMVLEVSVADRDAIWSLWQAPISESQW 

RPQGFWSKALPSSAANYSPFERQLLAYYW 

ALVETEHLTMGHQVTKQPELPIMNWVLSD 

PSSHKVGCAQQHSIIKWKWYICDRARAGP 

EGTTTPVrTQWAHEQSGHGGRDGGYTWA 

QQQGLPLTKADLATATAECPICQQQRPTLS 

P 


2929 


A 


1 


274 


MARATLSAAPSNPRLLRVALLLLLLVAAS 
RRAAGASWTELRCQCLQTLQGIHLKMQS 
VhAATLKNGKKACLNPASPMVQKHEKILN 
NP 


2930 


A 


1 


1236 


MLIGSSEQEVANTLDLFVRHLHAREWEIKL 

TKIQGPSTSVKFLGVQWYGACQDIPSNVK 

DTLLHLAPPITKKEAQCLLGLFGFWRQHIP 

HLELPDCNWVLSDPSSYKVGCAQQYSIIKW 

KWYICDWAQANPEGTINGLARWSGTWKK 

HNWKIGDKEIWGRGMWMDLSEWSKTVKI 

WSHVSAHQQMTSAEEDFNNQVDRMTRS 

MDTTQPLSPTTPVITQWAHEQSDHGGRDG 

DYTWAQQHGLPLTKSFTFAKEVWQWAHA 

HGIHWSYVPHHPEAAGLIERWNGLLKSQL 

KCQLGDNTLQGWGKVLQKAMYALNQHPI 

YGTVSPIARLHGSRNQGEEVEVAPLIITPGD 

LLAKFLLPVSTTLHSAGLGWYGFKLTRD 

GLVMVNTECQLDRIEGCKVLFLGVSVRVS 

PKEINI 


2931 


A 


3 


714 


RPPFIALCI^NVAFMLPWQFAQFILFTQIAS 

LFPMYWGYffiPSKFQKIIYMNMISVTLSFI 

IMFGNSMYI^SYYSSSLLMTWAIILKRNEI 

QKLGVSKLNCWLIQGSAWWCGTIILKFLTS 

KILGVSDfflCLSDLIAAGILRYTDFDTLKYT 

CSPEFDFMEKATLLIYTKTLLLPVVMVITCF 

IFKKTVGDISRVLATNVYLRKQLLEHSELA 

FHTLQLLAFTALAILDLRLKLVL 


2932 


A 


1 


699 


MRFVMSVTMYHTTLVGLDIKHLNLESGKV 

WVMGKASKEPRLPIGRNAVAWffiHWLDL 

RDLFGSKDDALFLSKLGKRISARNVQKRFA 










EWGIKQGLNNHVHPHKLRHSFATHMLESS 
GDLRGLFKFVSAKRHAKGSKVGSPIIYADQ 
HIGAGQNHPARWRGLPRKSRLLVSPSNDK 
RKKAGAAPVAALRHFPPISEENATVKIQFRn 
RRLNHQQLVKPYPQVPISQATNQFR 


2933 


A 


1 


924 


MFAHSYSSLAAVLLTATLTAAGnSFPVALC 

LVIGANLGSGLLAMLNNSAANAAARRVAL 

GSLLFKLVGSLHLPFVHLLAETMGKLSLPK 

AELVIYFHVFYNLVRCLVMLPFVDPMARF 

CKTIIRDEPELDTQLRPKHLDVSALDTPTLA 

LANAARETCALATPWTDDGRKYAYSAAS 

GGRRSATKVMVVVTDGESHDGSMLKAV1 

DQCNHDNILPJ'GIAVLGYLNRNALDTKNLI 

KEKAIASIPTERYFFNVSDEAALLEKAGTL 

GEQEFSIEDMDLGDEVYTVGRPHPMIDPTL 

RNQLIADLGAKPQVRVLLLDWIGFGATA 

DPAASLVSAWQKACAARLDNQPLYAIATV 

TGTERDPQCRSQQIATLEDAGIAWSSLPE 

ATLLAAALIHPLSPAAQQHTPSLLENVAVI 

NIGLRSFALELQSASKPWHYQWSPVAGQ 

GKWLANPELLEADADAEYAAVTDIDLADI 

KEPELCAPNDPDDARPLSAVQGEKIDEVFIG 

SCMTNIGHFRAAGKLLDAHKGQLPTRLWV 

APPTRMDAAQLTEEGYYSVFGKSGARVSSI 

PCAVPCVWARVADGATWSTSTRNFPNRL 

GTGANVFLASAELAAVAALIGKLPTPEEYQ 

TYVAQVDKTAVDTYRYLNFNQLSQYTEK 

ADGLLKPRFBPWQB 1CTT DTT ATYHEQHRD 

EPGPGRERLRRMALPMEDEALVLLLEEKM 

RESGDIHSHHGWLHLPDHKAG*SSDNGKY 

QRLFYLPAPRRSGTLPASAVCQSAPQQ/LA 

SSAEARKTFAPVPRRFGKLRVEVETTVAPS 

ATRAHTOGTAQGILDTRAPLLPKTL 


2934 


A 


201 


632 


MPGLLNWITGAALPLTASDVTSCVSGYAL 

GLTASLTYGNLEAQPFQGLFVYPLDECTTV 

IGFEAVLADRVVTVQIKDKAKLESGHFDAS 

HVRSPTVTGNILQDGVSIAPHSCTPGKVTL 

DEDLERILFVANLWT1APMYRAVWD 


2935 


A 


267 


25 


MGAVQRIJVIKnMLNYRLVAHFLVLFAQK 

KANRQRTRVHRGSLWLSECESPNGPGGRH 

TEPAEGRQARGRTPQQGFAVSLM* 


2936 


A 


34 


330 


MNKHFIJLFIXYCLIAAVTSLQCITCHLRT 
RTDRCRRGFGVCTAQKGEACMLLRIYQRN 
TLQISYMVCQKFCRDMTFDLRNRTYVHTC 
CNYNYCNFKL* 


2937 


A 


34 


411 


MTAGTVVITGGILATVILLCIIAVLCYCRLQ 
YYCCKKSGTEVADEEEEREHDLPTHPRGP 
TCNACSSQALDGRGSLAPLTSEPCSQPCGV 
AASHCTTCSPYSSPFYIRTADMVPNGGGGE 
RLSFAP 


2938 


A 


333 


545 


MMPTNLAHLVFWQALLASGRFSLMEHYP 










PNVQSNRGITHYMLPRGYELGLLYSSAGNT 
GTSRPRRTHYGT* 


2939 


A 


242 


382 


MNRVMRGLAITTTCLLSN1LQAITISPSILW 
NHAAVQYVHGHSLVQA* 


2940 


A 


108 


290 


MPQWLALQRQQALLTLLSGAGTWAGMRP 
PSOCWPOGPSTGNOSLSHGRGELLTHAVG 
VCI* 


2941 


A 


109 


417 


MLMLILVTGVSSLRNMIMCDYISRAKLKSS 
HTVLSYCTLKQEYDDSRGVMNLEAREEGS 
RGFYCLGCIDTGLQTPGGRGPSSALVTSVH 
LACEEYSKHSFVK* 


2942 


A 


155 


575 


RRAQGEPERRAPSLAWTCRDPEPTREELAL 

TSTTTSCISSLSIVPFQTILVGDSGVGKTSLL 

VQFDQGKFIPGSFSATVGIGFTNKVGTVDG 

VREKLPIVWTPAGKERFRSVTHAYYRDAHG 

*FLLYDPNHPJSLLRLSAL 


2943 


A 


429 


1 


RLVYASTANKIHF*NDNNPGKNTDTVPHC 

HKLCNQDSHIRGNHRGQHIHSKTAKPCSG 

KTTFVTITFLLSDKHKYKLAPLRPAAASYSS 

PFTRKVTCLTRITEPS*P*HTAATLRSDQRS 

QTCSHGTGTLSWRSSRWRSSSTK 


2944 


A 


1728 


2782 


RASSAVRGSLGDSARGRRRRSIVKVSLHPA 

VMSKSESPKEPEQLRKLFIGGLSFETTDESL 

RSHFEQWGTLTDCWMRDPNTKRSRGFGF 

VTYATVEEVDAAMNARPHKVDGRVVEPK 

RAVSREDSQRPGAHLTVKKIFVGGIKEDTE 

EHHLRDYFEQYGKIEVIEIMTDRGSGKKRG 

FAFVTFDDHDSVDKIVIQKYHTVNGHNCE 

VRKALSKOEMASASSSORGRSGSGNFGGG 

RGGGFGGNDNFGRGGNFSGRGGFGGSRG 

GGGYGGSGDGYNGFGNDGSNFGGGGSYN 

DFGNYNNQSSNFGPMKGGNFGGRSSGPYG 

GGGQYFAKPRNQGGYGGSSSSSSYGSGRR 

F 


2945 


A 


234 


657 


VQQPGRGLDLSTDGPGGRSQVGLIWSCCC 

LH*AASGEPGGRCPGS/GAPGPAGSALEFR 

ARDGVPVGVGGPSWESHSPAAATPPPAECR 

GPGPTPSPAPGEAAPEDREDGAAAPGRAEP 

ASrVAPADGSQGQVLATQAGALGA 


2946 


A 


1725 


2140 


Y1YQISQTSGKL*PGDKSVHSELV/SSCNTSI 

ISSSGISSTSLL*LRRLFSAASANSASSVASK 

K*ASSMPLSQTASADAPVDSLLGDGL*GF 

WVSLLLVSSASSWNSSSSLPCKNRRHTSAG 

NGKQSDLKFFALHTGS 


2947 


A 


1 


1134 


DTYCRGDQLHELLWRDHLGRRKQYGGDF 
LRARRSSPALMAGASGKVTDFNNGTYLVS 
FTLFWEGQVSLSLLLEHPSEGVSALWSARN 
QGYDRVIFTGQFVNGTSQVHSECGLILNTN 
AELCQYLDNRDQESFYWVRPQHMRCAAL 
THMYSKNKKVSYLSKQEKSLFERSNVGVE 
DvffiKFNTISVSKCNTLKSVDLHESGKLQHQ 










IJVVDLDRNINIOWOKYCYPL1GSM T YSVK 

EMEYLTRATORTGGEKNTVTVISLGQHFRP 

FPIDWIRRALNVHKAIQHLLLRSPDTMVII 

KTENIREMYNDAERFSDFHGYIQYLIIKDIF 

QDLSDIRHVLKYNASKNAADLDLFFSSNL 

DDFYNFSELHKGRSKSPLMQITQ 


2948 


A 


504 


198 


QLIQHQTVHTGRKLYECKECGKAFNQGST 
LIRHQRJHTGEKPYECKVCGKAFRVSSQLK 
QHQRIHTGERPYQCKELKGRGAEMLAVLA 
VKEQNRTPVNYGK 


2949 


A 


1 


578 


MGETALMIQLPPPGPALGTWGLWDLQFKT 

KITSTDTDPRSHLOETGDNILTIJFTMHPPL 

ESEWTICNFRQIWLLSSWSTLETRAQPLHS 

YFRKLKGRGTAIAGIVFGIVFIMGVIAGIAI 

CICMCMKNEIRATRVGILRTTHINTVSSYPG 

PPPYGHDHEMEYCADLPPPYSPTPQGPAQR 

SPPPPYPGNARK 


2950 


A 


1 


943 


AAAGRARGAGDMFRRKQSNPRQIKRSLGD 

MEAREEVQLVGASHMEQKATAPEAPSPPS 

ADVNSPPPLPSPTSPGGPKELEGQEPEPRPT 

EEEPGSPWSGPDELEPWQ/DGRRRIRARLS 

LATGLSWGPFHGSVQTRASSPRQAEPSPAL 

TLLLVDEACWLRTLPOALTEAEANTEIHRK 

DDALWCRVTKPVPAGGLLSVLLTGEPHST 

PGHPVKKEPAEPTCPAPAHDLQLLPQQAG 

MASELATAVTNKDVFPCKDCGIWYRSERNL 

QAHLLYYCASRQGTGSPAAAATDEKPKET 

YPNERVCPYPQSRKSCPG 


2951 


A 


2 


435 


AVCRTSSDVDDNPPVFNQLIYESYVSELAP 
RGHFVTCVQASDADSSDFDRLEYSILSGND 
RTSFLMDSKSGVITLSNHRKQRMEPLYSLN 
VSVSDGLFTSTAQVHIRVLGANLYSPAFSQ 
STYVAEVRENVAAGTKVIHVRATD 


2952 


A 


199 


399 


MPGSLCGRRWCWLLGSVTSKQVLl^Ui,K 

KFSRSSRLQEDQERSLGFRPFTHSPDMMW 

DLPAQDEWS 


2953 


A 


38 


397 


TVLCLTLTSCSFRQSLAT*SFGG/MGSGSVH 

FGVGGAFLEPSIHWGS/GSRSLSVSSTHFVP 

SSSS/GGYGSGDASVLCRSDRLLTGTKITTQ 

NIHD/RLGSYLDKVRALEEAG\ELKVKICD 

WAP 


2954 


A 


2 


673 


NSRVEGQLCDLDPSAHFYGHCGEQLECRL 

DTGGDLSRGEVPEPLCACRSQSPLCGSDGH 

TYSOTCRLOEAARARPDANLTVAHPGPCES 

GPQIVSHPYDTWNVTGQDVIFGCEVFAYP 

MASIEWRKDGLDIQLPGDDPfflSVQFRGGP 

QRFEVTGWLQIQAVRPSDEGTYRCLARNA 

LGQVEAPASLTVLTPDQLNSTGIPQLRSLN 

LVPEEEAESEENDDYY 


2955 


A 


1 


440 


GNQKCTRNNHRISSLLCDPQEGYLQMLQIS 
m.YLYDSVLMLANAFHRKLEDRKWHNM 










ASLNCIRKSTBCPWNGGRSMLDTIKKGHITG 
LTGVMEFREDSSNPYVQFEILGTTYSETLVE 
FPFVMVAENTLGOPKRYKGFSIDVLDALA 


2956 


A 


23 


395 


GSGDAGGQHRARCPSGRAGNWDWHPPA 

MEEPGPPGGLSQDQVERCMGAMQEGMQ 

MVKLRGGSKGLVRFYYLDEHRSCTRWRPS 

RKNEKAEISIDSIQEVSEGRQSEVLQRYPDG 

SFDPNCCCSI 


2957 


A 


663 


144 


K"BT SAV? AfJTPW^PO^OOPfrGOSVAACVP 

AAPAAAGLCSGRAQKVPPPPSLAGWPPGV 

NAPPPPVCSSVRLHVCQSDRLWVRLAARR 

GILALLRSALKAATLAGCQSVRWSVRPSES 

LRPTSNAASLFRSSVPTVLSHSVPLAASLG 

KRRACGGREHASVAVYLSVCLSLPT 


2958 


A 


1856 


591 


PPTPTAETLTSEDAQPGSPLATGTDQVSLD 

KPLSSAAHLDDAAKMPSASSGEEADAGSL 

LPTTNELSQALAGADSLDSPPRPLERSVGQ 

U>SPPLIJ > TPPPKASSKTVKKMSQAKPHSSK 

PPA*RWTI7PIJRGQI^TPTGSPHLTTVHRP 

LPPSRVIEELHRALATKHRQDSFQGRESKG 

SPKKRLDVRLSRTSSVERGKEREEAWSFD 

GALENKRTAAKESEENKENLIINSELKDDL 

LLYQDEEALNDSnSGTLPRKCKKELLAVK 

T PKTRP^KTiFT FDRNTFPRRTDEEROEIROOI 

EMKLSKRLSQRPAVEELERRNILKQRNDQ 

TEQEERREIKQRLTRKLNQRPTVDELRDRK 

ELIRFSDYVEVAKAQDYDRRADKPWTRLS 

AADKAAIRKELNEYKSNEMEVHASSKHLT 

RFHRP 


2959 


A 


1578 


685 


WLQQGLAQRTIILIGRIYQSWLAIMPGCNH 

SMTQLHMLSGLRIYHNKSAPVIEVYCPQKP 

ICKQNWTWLEIMNVFVWEDCIAKQAEVLC 

NNSYGmDWSPKGMFSLNCTCQSVCHSHT 

MFSWSEQNSQMVEMVRNTARVPIIWKRG 

ftTVAPOPONflWSTVEAKHKDLWKLLMSV 

NKJKTWERIKKHLEGHSTNLFLDMAKLKEQ 

IFKASQAHLTLMPGTGVLKGAADKLAASN 

PT KWKfKTLGSSVISMNflVLLICVVCLCVV 

CRCRS*LLREVAHRDKAAFAFIALQKQEG 

GYAGE 


2960 


A 


470 


258 


MnAIGGVTVASGLVFrVLLMTRYKVYGDG 

DSRRVKGSRALPRVRHVCSQTNGAGTGAE 

OAPALPAQDHY* 


2961 


A 


3 


866 


ELNLQDFSHIJDHRDLIPIIAALEYNQWFTK 

I^SKDLKLSTDVCEQILRVVSRSNRLEELV 

I^NAGLRTDFAQKIASALAHNPNSGLHTI 

NLAGNPLEDRGVSSLSIQFAKLPKGLKHLI 

I^KTHYYPKAVNSLSQSLSANPLTASTLVH 

II)I^GNSnJlGDDI^HMYNFLAQPNAIVHL 

DLSNTECSLDMVWGALLRGCLQYLAVLN 

LSRTWSHRKGKEWPSFKQFFSSSIA1MHI 










NI^GTiaSPEPIXALLLGLACMINLKGVSL 
f)T <?NCFLRSGGAOVLEGCIG 


2962 


A 


574 


203 


TQAFEQEVGNPLCIPSHCMGAVFILLNLAT 
AHSSGLCLLQLELSFRSLSTTAVHCCPRPTI 
DFHP/LGSSRVSAVLLIQ/QRCPLPLPIGLEA 
DHCSCMAKGPGFILIELNTSHWVPQFSSVT 


2963 


A 


399 


15 


NTMVAHHTVENTYFCPVLATGLSGLYSSLP 

TKLEEKGEEWHCLLKDDWLLLPSLVQFM 

NSLEFCNAVIQVAHPLIRNQLVIYISNEFLV 

PVLAPALHKVPVQEVMSPTAYLDLFVRSIS 

EPALLEBF 


2964 


A 


3 


567 


CSEIFASLRLPRIMAHSKQPSHFQSLMLLQ 
WPT <?YT A TFWTT OPT .FVYLLFTSLWPLP VL 
YFAWLFLDWKTPERGGRRSAWVRNWCV 
WTffiRDYl^ITx^^ 

HGLLTFGAFCNFCTEATGFSKTFPG1TPHLA 
TI^WFlUPFVx^YLMAKGASDHTYWSFW 
<?MFT T GNAPF 


2965 


A 


2 


394 


TLADGGEGQFDGTFEPATVALPGGEHAEN 

AVQIHKWTGTMALIFSFLIAALVLYVSWK 

CFPASLRQLRQCFVTQRRKQKQKQTMHQ 

MAAMSAQEYYVDYKPNHIEGALVIINEYG 

VTrTTOOPARFCEV 


2966 


A 


2 


412 


EFLSSNQITQLPNTTFRPMPNLRSVDLSYN 

KLQALAPDIJHGLRKLTTIJHMRANAIQFV 

PVPJFQDCRSLKFLDIGYNQLKSLARNSFA 

GLFKLTELHLEHNDLVKVNFAHFPRLISLH 

SLCLRRNKVAIWSSLDW 


2967 


A 


1 


1343 


ERCKVQSSTLVSSLEAELSEVKIQTHT/QQE 

NHLLKDELEKMKQLHRCPDLSDFQQKISS 

VLSYNEKLLKEKEALSEELNSCVDKLAKSS 

LLEHRIATMKQEQKSWEHQSASLKSQLVA 

SQEKVQNLEDTVQNVNLQMSRMKSDLRV 

TQQEKEALKQEVMSLHKQLQNAGGKSWA 

PEIATHPSGLHNQQKPJLSWDKLDHLM/NV 

EEQQLLWQENERLQTMVQNTKAELTHSRE 

KVRQLESNLLPKHQKHLNPSGTMNPTEQE 

KLSLKRECDQFQKEQSPANRKVSQMNSLE 

r»PT FTTHT FMFOT KKKOVKLDEOLMEMOH 

LRSTATPSPSPl^WDLQLLQQQACPMVPR 

FOFT OT OROLLOAERJNOHLOEELENRTSE 

TNTPQGNQEQLVTVMEERMIEVEQKLKLV 

KRLLQEKVNQLKEQVSLPGHLCSPTSHSSF 

NSSFTSLYCH 


2968 


A 


382 


203 


RPSSPGPPCPEAGKR/RFGCGGAGSLRPEHS 
\TRPPPRGLGKGRGQREKRGASKEGSEGCA 


2969 


A 


303 


46 


AWFKLLSPRKKHLKNPFVGGVGCAWRT 
GWEWSPGQEQAPPPATGSMLATSSPPSGPP 
PPP*PPGFMLPPLGDGLGAGTSAGRS*EKG 
RGK 



WO 03/080795 


2970 


A 


3 


586 


MVECPACQH*RPTLSLRDTSYHQVECIRSL 

LPWNGHQFVLTRIDICSK*/G/FVFPNYFASS 

STTI*ELTGCLIH*HT*N*GTH/LIAKEV*Q*T 

RSYKI/HWCI/PHHPEAASQIGFWNGLLKTG 

L/QLRLRCNALQS/WGAVLQNMVYALKCI 

GPKWrYSIVSPVGNHVHTGVASnTTPSHSPV 

EFVPPRSEIWSQLGYDP 


2971 


A 


299 


21 


MGSSVLSIWILSPSIYPILSPLAMPCLSRTDL 
IRVRRIQGAWPSEGTASSIRGWVLTKLRMS 
SGKALEALYCIPGAAQHPGLGVTRVWSGR 
T* 


2972 


A 


1 


555 


KKVGNYYTTPIYRFRMKCHLCVNYIEMQT 

DPANCDYVTVSGAQRKEERWDMADNEQV 

LTTGERHPLTCLGAL/DPESALGPPKPSRAL 

IVAEHEKKQKLETDAMFRLEHGEADRSTL 

KKALPTLSHIQEAQSAWKDDFALNSMLRR 

RFRVRGAPARGQRGCMVDQGPGPALPPPH 

PSFEQATCTF 


2973 


A 


1 


598 


MAWIPAALGTAALVPWSILRGKAPRYWL 

LPLLLDPDKVPIISARDLTSPDAALASLTAQ 

SGGLEELHLKLVHEVAVMANTECQLDWIE 

GCKVLrLACRLWDLVIMTHPAFYQSVQWG 

KGNDQTFQGRLDTGCELMLIPGDPNCGPP 

VKVGVYGGITYHCDLTKEELEPRVFREVTV 

KGIDASDYQTVQLPKGTESSRN 


2974 


B 


1 


2142


MGGAGSPQVILVSHTPQSASAACEELAYQV 

AGVSGNLAPGNQPEKEGRAHQCLECDRAF 

SSAAVLMHHSKEVHGREPJHGCPVCRKAF 

KRATHLKEHMQTHQAGPSLSSQKPRVFKC 

DTCEKAFAKPSQLERHSRIHTGERPFHCTL 

CEKAFNQKSALQVHMKKHTGERPYKCAY 

CVMGFTQKSNMKLHMKRAHSYAVAVAM 

GGTAQCPPGATACLGTAICPSGLRAQRPSN 

I^WEAAKPKSGRNRKIEAPTWALSTSKDP 

QTEGLRNPQTCVQIRSNPFCAFAQGFSLISE 

LRTLNCFVGLCDSQSGKQQLGFYSGQPAT 

EAWQKYSIAVC^RSEQEISATRLGUCNTN 

VNKLDGGCGAWNFLGGMSEHNSPPSGRAI 

LLPVVFTEVFPGPWTPEQGSfflCRMNLAPT 

FQAFLPKTGFPIDPQELLQGPIERTTWPGTV 

YTFRSAIVTARAVWVRPRMDRRADLSSAT 

QSASAEKFGGRVSAGHCALPLPARPVTAS 

VYGRLARLRGCLEDSYPSALSAQVFLDSPA 

VGCGLETRLFIEAALGPPCRATVTSRGHLL 

DISrTKSPGRPCFLSVCLHGSDQQKRKGAA 

ATAKRKSKGGGVNVEGRLCTWPPEDPPKS 

WSLAFGPLQEKTTELNLHPRCW ARCLSHW 

ELPPGPRGRAQAPDWTGSKSFREQLLTFTL 

WGVQEKISKHQANQGKEAPAYTGLEDSDP 

GGLCAV* 


2975 


A 


248 


597 


DRCPAAWDRHPAGIQSSRREPSKATWTLR
SKLSVQDGRRDSSLRLNCKVAARLGAGHP 
PMLRLGLRC*YPGKQGLEWTSSKLQQTCH 
*GS*LLKGKLTNRKDfflSKTPSVRHYHQR 


2976 


A 


2 


353 


EVDHRGDYVSHEIMHHQRRRRAVPVSEVE 
PLHLRLKGSRHDFHVDLRTSSSLVAPGFIV 
QTLGKTGTKSVQTLPPEDFCFYQGSLRSHR 
NSSVALSTCQGLSGMIRTEEADYFLRPL 


2977 


A 


134 


412 


MVKFIGPRVRRGLESPLCHACYLALCTLAL 
VRLCALSRSRSLSLMLILQAFYRPPMSQEP 
ALSTVLFLLLLLANPPTKVSRSHRKERVLL 
LVA 


2978 


A 


1 


598 


MAFLETSAPLYEHTWTLQVAFSTVGLGETL 

KVAMISMSTSSGYFLQLLQYCCSSTinTGY 

KGFLRDLKVETRADGVMRTMAPEKLLKS 

MPILQGQEDALLEFDVHPNELTNGVINAAF 

MLLFKDLLKLFACYhTOGVINLLGTWMKLE 

TTILSKLLQRQKTKHCMFSLIGGNRTMRTL 

GHRKGNITHWALLAGGGAAEG 


2979 


A 


793 


1 


GSRIDDMKSERRPPSPDVIVLSDNEQPSSPR 

VNGLTTVALKETSTEALMKSSPEERERMIK 

QLKEELRLEEAKLVLLKKLRQSQIQKEATA 

QKPTGSVGSTVTTPPPLVRGTQNIPAGKPS 

LQTSSARMPGSVffPPLVRGGQQASSKLGP 

QASSQWMPPLVRG\AQQfflSIRQHSSTGPP 

PLLLAPRASVPSVQIQGQREQQGLIRVANV 

PNTSLLVNIPQPTPASLKGTTATSAQANSTP 

TSVASWTSTESPASRQAA 


2980 


A 


2 


1427 


LLARGAGRTNPAPPLMSCGPWGKFLKCCE 

VYKSGP YKVQ*EErnHSRAEAESTY QHCYE 

ELQTLAGKHGDDLRCAK/T/EISEMNQNISR 

LQAETEGLKGQGASLEAAIADAEQWGELA 

KDANTKLSELEAAMQRAKQDMA/RQLGE 

YQKLALDIEIATYRKLLEGEESRLESGMQN 

VSIHKKTTSGYAGAPARIVSLLQNELLSLE 

VGVLKGHPTGKGEELGAPYSECSFGLCRR 

TVMLTQAPSSVVRSRNSRNHTVNSGGSCL 

SASTVAIPAINDSSAAMSACSTISAQKRTCC 

TACEPARKYKDTASHQEPAVCQPACQLET 

ADPKGGGVLALPQPPSPGMLCWPYCRAH 

ATDYFLANFFSEFPCHFLHRAGAAQTQAT 

GDGMEHGQSRELPKRKAPREESETSEEKSP 

NKWGPVSKQKKQLLVDILTTnRPTRGNAY 

TGLSTRKWKPRSEENALMQPNKKDEKGTL 

TQKLGL 


2981 


A 


4235 


940 


ARGRRSRPVWAASWGGRGRPAARRRPRG 

LAATMGFELDRFDGDVDPDLKCALCHKV 

LEDPLTTPCGHYFCAGCVLPWWQEGSCP 

ARCRGRLSAKELNHVLPLKRLILKLDIKCA 

YATRGCGRWKLQQLPEHLERCDFAPARC 

RHAGCGQVLLRRDVEAHMRDACDARPVG 

RCQEGCGLPLTHGEQRAGGHCCARALRA

HNGALQARLGALHKALKKEALRAGKREK 

SLVAQLAAAQLELQMTALRYQKKFTEYSA 

RLDSLSRCVAAPPGGKGEETKSLTLVLHRD 

SGSLGFNnGGRPSVDNHDGSSSEGIFVSKIV 

DSGPAAKEGGLQJHDRIIEVNGRDLSRATH 

DQAVEAFKTAKEPIWQVLRRTPRTKMFT 

PPSESQLVDTGTQTDITFEHIMALTKMSSPS 

PPVLDPYLLPEEHPSAHEYYDPNDYIGDIH 

QEMDREELELEEVDLYRMNSQDKLGLTVC 

YRTDDEDDIGIYISEIDPNSIAAKDGRIREG 

DRIIQINGTEVQNREEAVALLTSEENKNFSL 

LIARAELQLDEGWMDDDRNDFLDDLHNfD 

MLEEQHHQAMQFTASVLQQKKHDEDGGT 

TDTATILSNQHEKDSGVGRTDESTRNDESS 

EQENNGDDATASSNPLAGQRKLTCSQDTL 

GSGDLPFSNKSFISPECTGAAYLGIPVDECE 

RFRELLELKCQVKSATPYGLYYPSGPLDAG 

KSDPESVDKELELLNEELRSffiLECLSIVRA 

HKMQQLKEQYRESWMLHNSGFRNYNTSI 

DVRRHELSDITELPEKSDKDSSSAYNTGES 

C3RSTPLTLEISPDNSLRRAAEGISCPSSEGA 

VGTTEAYGPASKNLLSrTEDPEVGTPTYSPS 

LKELDPNQPLESKERRASDGSRSPTPSQKL 

GSAYLPSYHHSPYKHAHIPAHAQHYQSYM 

QLIQQKSAVEYAQSQMSLVSMCKDLSSPT 

PSEPRMEWKVKIRSDGTRYrrKRPVRDRLL 

REPJUJQREERSGMTTDDDAVSEMKMGR 

YWSKEEPJCQHLVKAKEQRRRREFMMQSR 

LDCUCEQQAADDRKEMNILEIJSHKKMMK 

KRNKKIFDNWMTIQELLTHGTKSPDGTRV 

YNSFLSVTTV 


2982 


A 


792 


389 


PTRPPL\QLQAPRAHLSEDQKRLLLMKQKG 

VMNQPMAYAALPSHGQEQHPVGLPRTTG 

PMQSSVPPGSGGMVSGASPAGPGFLGSQP 

QAAIMKQMLIDQRAQLIEQQKQQFLREQR 

OOOOOOOOILAEQVTCPLA 


2983 


A 


3 


268 


FTRSDELARHYRTHTGEKRFSCPLCPKQFS 
RSDHLTKHARRHPTYHPDMDEYRGRRRTP 
RIDPPLTSEVESSASGSGPGRAPSFTTCL 


2984 


A 


3 


431 


GPEFPGSAKLVFLDLSYNNLTQLGAGAFRS 

AGRLVKLSLANNNLVGVHEDAFETLESLQ 

VLED^NNLRSLSVAALAALPALRSLRLD 

GNPWLCDCDFAHLFSWIQENASKLPKGLD 

EIOCSLPMESRRISLRACRRPASRV 


2985 


A 


108 


497 


MGIYQMYLCFLLAVLLQLYVATEAILIALV 
GATPSYHWDLAELLPNQSHGNQSAGEDQ 
AFGD WLLT ANGSEIHKHVHFSS SFTSIASE 
WFLIANRSYKVSAASSFFFSGVFVGVISFG 
QLSDRFGRKKVY 


2986 


A 


488 


754 


QSIYQEKFDDENFILKHTGPGILSMANAGP 
TQMVPSFSPVWPRLSGWMASTRSLAK*EE
GVNIMEAMECSGSGNGETGKKIPTAXCGQ 
L
Table 9



SEQ ID
NO: of full-length
nucleotide
sequence


SEQ ID
NO: of full-length
peptide
sequence


SEQ ID
NO: of
contig
nucleotide
sequence


SEQ ID
NO: of
contig
peptide
sequence


Identification of
Priority Application
that contig nucleotide
sequence was filed
(Attorney Docket
No._SEQ ID NO.) *












1 


1042 








2 


1043 








3 


1044 








4 


1045 








5 


1046 


2083 


2535 


790 104 


6 


1047 








7 


1048 








8


1049 








9 


1050 


2084 


2536 


790 16362 


10 


1051 








11 


1052 








12 


1053 








13 


1054 








14 


1055 








15 


1056 








16 


1057 








17 


1058 


2085 


2537 


784 5743 


18 


1059 


2086 


2538 


790 167 


19 


1060 








20 


1061 


2087 


2539 


788 2001 


21 


1062 








22 


1063 


2088 


2540 


784 1683 


23 


1064 


2089 


2541 


785 1699 


24 


1065 








25 


1066 








26 


1067 


2090 


2542 


789 5434 


27 


1068 








28 


1069 


2091 


2543 


790 13996 


29 


1070 








30 


1071 








31 


1072 








32 


1073 








33 


1074 


2092 


2544 


784 6213 


34 


1075 


2093 


2545 


784 1993 


35 


1076 








36 


1077 


2094 


2546 


790 3341 


37 


1078 


2095 


2547 


791 5740 


38 


1079 








39 


1080 


2096 


2548 


792 4643 


40 


1081 








41 


1082 








42 


1083 








43 


1084 


2097 


2549 


790 407 


44 


1085 








45 


1086 


2098 


2550 


785 1457 


46 


1087 


2099 


2551 


790 20129 


47 


1088 








48 


1089 


2100 


2552 


790 18963 


49 


1090 


2101 


2553 


790 515 


50 


1091 


2102 


2554 


787 7703


51 


1092 








52 


1093 








53 


1094 


2103 


2555 


784 7239 


54 


1095 


2104 


2556 


790 19031 


55 


1096 


2105 


2557 


791 1750 


56 


1097 








57 


1098 








58 


1099 








59 


1100 


2106 


2558 


790_23024 


60 


1101 








61 


1102 


2107 


2559 


788 3666 


62 


1103 








63 


1104 


2108 


2560 


787 2031 


64 


1105 








65 


1106 








66 


1107 


2109 


2561 


784 2939 


67 


1108 


2110 


2562 


787 4769 


68 


1109 


2111 


2563 


792_7097


69 


1110 


2112 


2564 


788 9897 


70 


1111 


2113 


2565 


790 29652 


71 


1112 








72 


1113 


2114 


2566 


784 4530


73 


1114 








74 


1115 








75 


1116 


2115 


2567 


787 7560 


76 


1117








77 


1118 








78 


1119 








79 


1120 








80 


1121 








81 


1122 








82 


1123 








83 


1124 


2116 


2568 


784_1264 


84 


1125 


2117 


2569 


791 1515 


85 


1126 








86 


1127 


2118 


2570 


784_3498 


87 


1128 








88 


1129 








89 


1130 








90 


1131 








91 


1132 








92 


1133 








93 


1134


2119 


2571 


791 1404 


94 


1135 








95 


1136 


2120 


2572 


784 9584 


96 


1137 








97 


1138 


2121 


2573 


787 7852 


98 


1139 








99 


1140 


2122 


2574 


788 5026 


100 


1141 








101 


1142 


2123 


2575 


790 16594


102 


1143 


2124 


2576 


790 975 


103 


1144 








104 


1145 








105 


1146 








106 


1147 








107 


1148 


2125 


2577 


790 11619 


108 


1149 


2126 


2578 


790 1040 


109 


1150 


2127 


2579 


787 946 


110 


1151 








111 


1152 








112 


1153 








113 


1154 


2128 


2580 


790 19602 


114 


1155 








115 


1156 


2129 


2581 


788 12191 


116 


1157 


2130 


2582 


784 5727 


117 


1158 








118 


1159 


2131 


2583 


784 7669 


119 


1160 








120 


1161 


2132 


2584 


784 5053 


121 


1162 








122 


1163 








123 


1164 








124 


1165 


2133 


2585 


790 9619 


125 


1166 








126 


1167 








127 


1168 


2134 


2586 


790 1144


128 


1169 








129 


1170 








130 


1171 








131 


1172 


2135 


2587 


790 16699 


132 


1173 


2136 


2588 


790 1170 


133 


1174 








134 


1175 


2137 


2589 


790 1171 


135 


1176 








136 


1177 








137 


1178 








138 


1179 








139 


1180 


2138 


2590 


785 66 


140 


1181 


2139 


2591 


790 11744 


141 


1182 








142 


1183 








143 


1184 


2140 


2592 


784 10222 


144 


1185 


2141 


2593 


790 1217 


145 


1186 


2142 


2594 


785 2455 


146 


1187 








147 


1188 








148 


1189 


2143 


2595 


784 3575 


149 


1190 








150 


1191 








151 


1192 








152 


1193 


2144 


2596 


787 9817


153 


1194 








154 


1195 


2145 


2597 


784 9353 


155 


1196








156 


1197 








157 


1198 








158 


1199 


2146 


2598 


784 4306 


159 


1200 








160 


1201 








161 


1202 








162 


1203 








163 


1204 


2147 


2599 


790 23831 


164 


1205 








165 


1206 








166 


1207 








167 


1208 


2148 


2600 


790 1363 


168 


1209 


2149 


2601 


784 1344 


169 


1210 








170 


1211 








171 


1212 


2150 


2602 


787 1542 


172 


1213 








173 


1214 


2151 


2603 


785 2871 


174 


1215 


2152 


2604 


787 5391 


175 


1216 


2153 


2605 


790 27456


176 


1217 








177 


1218 


2154 


2606 


784 1229 


178 


1219 








179 


1220 


2155 


2607 


788 1187 


180 


1221 


2156 


2608 


784 256 


181 


1222 








182 


1223 








183 


1224 


2157 


2609 


790 6023 


184 


1225 








185 


1226 


2158 


2610 


790 28512 


186 


1227 








187 


1228 








188 


1229 








189 


1230 








190 


1231 








191 


1232 








192 


1233 


2159 


2611 


790 27560 


193 


1234 


2160 


2612 


784 9678 


194 


1235 








195 


1236 


2161 


2613 


787 2238 


196 


1237 








197 


1238 


2162 


2614 


787 8011 


198 


1239 








199 


1240 


2163 


2615 


784 9436 


200 


1241 


2164 


2616 


787 6897 


201 


1242 








202 


1243 








203 


1244 


2165 


2617 


790_1649


204


1245 








205


1246 


2166 


2618 


790 1664 


206


1247 


2167 


2619 


790 1671 


207


1248 


2168 


2620 


789 4182 


208


1249 


2169 


2621


787 3365


209


1250


2170 


2622 


790 24699 


210


1251








211


1252


2171 


2623 


790 24002 


212


1253








213


1254


2172 


2624 


790 1713 


214


1255








215


1256


2173 


2625 


790 12005 


216


1257 








217


1258 


2174 


2626 


787 371 


218


1259 


2175 


2627 


788 11375 


219


1260 


2176 


2628 


792 6253


220


1261 


2177 


2629 


790 20480 


221


1262 








222


1263 


2178 


2630 


787 8084 


223


1264 








224


1265 


2179 


2631 


790 1787 


225


1266 


2180 


2632 


787 5659 


226


1267


2181 


2633 


790 14480 


227


1268 


2182 


2634 


790 1801 


228


1269








229


1270


2183 


2635 


790 22521 


230


1271


2184 


2636 


790 3633 


231


1272








232


1273


2185 


2637 


787 5670 


233


1274


2186 


2638 


790 20482 


234


1275








235


1276


2187 


2639 


790 6685 


236


1277


2188 


2640 


785 2624 


237


1278








238


1279 








239


1280 


2189 


2641 


787 6797 


240


1281 


2190 


2642 


784 5046 


241


1282








242


1283 








243


1284 








244


1285 








245


1286 








246


1287








247


1288


2191 


2643 


784 6709 


248


1289 








249 


1290 








250 


1291 


2192 


2644 


787 3930 


251 


1292 








252 


1293 


2193 


2645 


790 2982 


253 


1294 


2194 


2646 


790 2086 


254 


1295


255


1296








256 


1297








257 


1298








258 


1299


2195


2647 


784 1280 


259 


1300








260 


1301


2196


2648 


787 9953 


261


1302


2197


2649


790 4258


262 


1303


2198


2650


790 16925


263 


1304


2199


2651


790 1256


264 


1305


2200


2652


788 6514 


265 


1306








266 


1307








267 


1308








268 


1309








269 


1310








270


1311








271 


1312








272


1313


2201


2653 


787 2484 


273 


1314


2202


2654 


790 2283 


274


1315








275


1316


2203


2655 


787 2505 


276


1317


2204


2656 


790 6292 


277


1318








278


1319








279 


1320


2205


2657 


784 2332


280


1321








281 


1322








282 


1323


2206


2658


790 2410 


283 


1324


2207


2659


790 6347 


284 


1325


2208


2660


790 12379 


285 


1326


2209


2661 


790 2433 1 



286 


1327


2210


2662 


784 8177 


287 


1328


2211


2663 


790 2436 


288


1329








289 


1330








290


1331








291


1332


2212


2664 


790 2469 


292 


1333


2213


2665


788 7 


293 


1334


2214


2666


784 6493 


294 


1335








295


1336








296 


1337


2215


2667 


790 2489 


297


1338








298 


1339








299 


1340


2216


2668


790 8006 



300 


1341 


2217 


2669 


787 2576 


301 


1342 


2218 


2670 


790 2537 


302 


1343 








303 


1344 


2219 


2671 


790 2542 


304 


1345 








305 


1346


306 


1347 


2220 


2672 


784 1031 


307 


1348 








308 


1349 


2221 


2673 


787 3678 


309 


1350 








310 


1351 


2222 


2674 


787 1269 


311 


1352 


2223 


2675 


790 4055 


312 


1353 








313 


1354 








314 


1355 








315 


1356 








316 


1357 








317 


1358 


2224 


2676 


790 2683 


318 


1359 








319 


1360 








320 


1361 








321 


1362 








322 


1363 








323 


1364 








324 


1365 


2225 


2677 


784 2283 


325 


1366 


2226 


2678 


785 999 


326 


1367 








327 


1368 








328 


1369 


2227 


2679 


787 2690 


329 


1370 


2228 


2680 


787 10099 


330 


1371 








331 


1372 


2229 


2681 


787 2706 


332 


1373 


2230 


2682 


790 3751 


333 


1374 


2231 


2683 


787 9316 


334 


1375 


2232 


2684 


790 20358 


335 


1376 


2233 


2685 


784 5053 


336 


1377 








337 


1378 








338 


1379 


2234 


2686 


791 2711 


339 


1380 








340 


1381 


2235 


2687 


784 3427 


341 


1382 








342 


1383 


2236 


2688 


790 2178 


343 


1384 


2237 


2689 


790 1467 


344 


1385 








345 


1386 


2238 


2690 


784 6221 


346 


1387 


2239 


2691 


791 3194 


347 


1388 


2240 


2692 


790 2886 


348 


1389 


2241 


2693 


790 23660 


349 


1390 








350 


1391 








351 


1392 








352 


1393 








353 


1394 








354 


1395 








355 


1396 


2242 


2694 


784 1062 


356 


1397 


2243 


2695 


784 552


357 


1398 


2244 


2696 


787 2790 


358


1399 


2245 


2697 


784 2232 


359 


1400 


2246 


2698 


785 231


360


1401


2247 


2699 


790 11073 


361 


1402 


2248


2700 


790 2954


362


1403








363


1404








364


1405 








365


1406








366


1407 


2249 


2701 


789 6204 


367


1408 








368


1409 








369


1410 








370 


1411 


2250 


2702 


787 9215 


371 


1412 


2251 


2703 


789 4399 


372


1413 


2252 


2704 


790 29004 


373


1414 


2253 


2705 


790 3053 


374 


1415 








375 


1416 








376 


1417 








377 


1418 


2254 


2706 


787 7446 


378 


1419 








379 


1420 








380 


1421 


2255 


2707 


784 2866 


381 


1422 


2256 


2708 


790 3129 


382


1423 








383


1424








384


1425


2257 


2709 


787 2844 


385


1426 


2258 


2710 


790 7572 


386


1427 


2259 


2711 


792 907 


387


1428 


2260 


2712 


785 396 


388


1429 








389 


1430 








390 


1431 








391 


1432 








392 


1433 








393 


1434 








394 


1435 


2261 


2713 


790 3197 


395 


1436 


2262 


2714 


790 26462 


396 


1437 








397 


1438 








398 


1439 








399 


1440 


2263 


2715 


790 3241 


400 


1441 


2264 


2716 


790 14778 


401 


1442 








402 


1443 








403 


1444 








404 


1445 


2265 


2717 


787 6238 


405 


1446 


2266 


2718 


784 2488 


406 


1447 








407 


1448 


2267


2719 


784 9081


408 


1449 


2268 


2720 


784 4949 


409 


1450 








410 


1451 








411 


1452 








412 


1453 








413 


1454 








414 


1455 








415 


1456


2269 


2721 


784 5313 


416 


1457 








417 


1458 


2270 


2722 


784 8649 


418 


1459 








419 


1460 








420 


1461 


2271 


2723 


790 3503 


421 


1462 


2272 


2724 


790 10950 


422 


1463 


2273 


2725 


787 1829 


423 


1464 


2274 


2726 


785 845 


424 


1465 








425 


1466 


2275 


2727 


787 1830 


426 


1467 


2276 


2728 


787 2166 


427 


1468 


2277 


2729 


787 918 


428 


1469 


2278 


2730 


790 2695 


429 


1470 








430 


1471 


2279 


2731 


785 406 


431 


1472 








432 


1473 


2280 


2732 


790 12656 


433 


1474 


2281 


2733 


787 2938 


434 


1475 


2282 


2734 


784 1698 


435 


1476 








436 


1477 


2283 


2735 


787 931 


437 


1478 








438 


1479 


2284 


2736 


787 5985 


439 


1480 


2285 


2737 


787 3966 


440 


1481 


2286 


2738 


790 17389 


441 


1482 


2287 


2739 


787 1371 


442 


1483 


2288 


2740 


784 2299 


443 


1484 








444 


1485 








445 


1486 


2289 


2741 


790 15495 


446 


1487 








447 


1488 


2290 


2742 


787 2985 


448 


1489 








449 


1490 


2291 


2743 


790 4868 1 


450 


1491 








451 


1492 








452 


1493 


2292 


2744 


785 410 


453 


1494 








454 


1495 


2293 


2745 


784 3656 


455 


1496 








456 


1497 








457 


1498 








458 


1499


459


1500 


2294 


2746 


790 17074 


460 


1501 








461


1502 








462


1503








463


1504








464


1505








465


1506


2295 


2747 


790 6796


466


1507


2296 


2748 


784 8548 


467


1508








468


1509 








469


1510 


2297 


2749 


787 4134 


470


1511 








471 


1512 








472


1513


2298


2750


785 607 


473 


1514








474


1515


2299 


2751 


784 4444 


475 


1516 








476


1517








477


1518


2300 


2752 


785 609 


478


1519


2301 


2753 


787 6219 


479


1520 


2302 


2754 


790 20198 


480


1521 








481


1522 


2303 


2755 


789 5808 


482


1523








483


1524


2304 


2756 


790 21362 


484


1525








485


1526








486


1527








487


1528


2305


2757 


790 8539 


488


1529








489


1530 


2306 


2758 


790 14555 


490


1531 








491


1532 








492


1533 


2307 


2759 


790 17165 


493 


1534 


2308 


2760 


789 5563 


494


1535








495


1536 








496


1537 


2309 


2761 


788 10803 


497


1538 


2310 


2762 


790 1392 


498


1539 








499


1540 








500 


1541








501


1542








502


1543


2311 


2763 


790 26265 


503


1544 








504 


1545 








505 


1546 








506 


1547 








507 


1548 


2312 


2764 


790 14264 


508 


1549 








509 


1550


510 


1551 








511 


1552 








512 


1553 


2313 


2765 


787 419 


513 


1554 


2314 


2766 


791 2696 


514 


1555 








515 


1556 








516 


1557 


2315 


2767 


785 1450 


517 


1558 


2316 


2768 


787 4026 


518 


1559 








519 


1560 


2317 


2769 


790 12340 


520 


1561 








521 


1562 








522 


1563 


2318 


2770 


790 13247 


523 


1564 


2319 


2771 


790 10245 


524 


1565 


2320 


2772 


787 1017 


525 


1566 


2321 


2773 


790 23263 


526 


1567 


2322 


2774 


790_16427 


527 


1568 








528 


1569 


2323 


2775 


789 5186 


529 


1570 


2324 


2776 


790 30441 


530 


1571 


2325 


2777 


789 3709 


531 


1572 


2326 


2778 


790 18037 


532 


1573 








533 


1574 


2327 


2779 


785 764 


534 


1575 








535 


1576 


2328 


2780 


789 5283 


536 


1577


2329 


2781 


790 22045 


537 


1578 


2330 


2782 


789 2553 


538 


1579 


2331 


2783 


790 16254 


539 


1580 


2332 


2784 


785 3340 


540 


1581 


2333 


2785 


789 1599 


541 


1582 


2334 


2786 


784 2310 


542 


1583 


2335 


2787 


790 4114 


543 


1584 


2336 


2788 


790_2511


544 


1585 








545 


1586 








546 


1587 








547 


1588 








548 


1589 


2337 


2789 


788 11639 


549 


1590 








550 


1591 








551 


1592 


2338 


2790 


790 14073 


552 


1593 








553 


1594 


2339 


2791 


790 27205 


554 


1595 








555 


1596 








556 


1597 


2340 


2792 


790 4994 


557 


1598 


2341 


2793 


790 6212 


558 


1599 


2342 


2794 


787 8231 


559 


1600 








560 


1601


561


1602








562


1603








563


1604








564 


1605


2343


2795


789 3199 


565 


1606


2344


2796


784 1039


566


1607








567 


1608








568


1609








569 


1610








570 


1611








571 


1612


2345


2797


784 9353 


572


1613








573


1614


2346


2798


790 29553 


574 


1615








575


1616


2347


2799 


787 669 


576


1617








577


1618


2348


2800 


790 4880 


578 


1619


2349


2801 


784 2473 


579 


1620


2350


2802


791 3397 


580


1621








581


1622








582 


1623


2351


2803


787 6211 


583


1624








584


1625








585


1626


2352


2804


790 19650 


586


1627








587


1628








588


1629








589


1630








590 


1631








591


1632








592


1633








593


1634








594 


1635








595 


1636


2353


2805 


788 1109 


596


1637


2354


2806


790 12340


597


1638








598


1639








599 


1640


2355


2807 


790 16631 


600 


1641


2356


2808


784 3763 


601 


1642 








602


1643








603 


1644








604 


1645 








605 


1646 








606 


1647 








607 


1648 








608 


1649 








609 


1650 








610 


1651 








611 


1652


612


1653 








613


1654 








614 


1655








615


1656 








616 


1657 








617


1658 








618


1659


2357 


2809 


790 24903 


619


1660


2358 


2810 


785 2185


620


1661 








621


1662 








622


1663 


2359 


2811 


790 20271 


623


1664








624


1665








625


1666








626


1667 








627


1668 








628 


1669 








629 


1670 


2360 


2812 


790 14778 


630 


1671 








631 


1672 








632


1673








633 


1674








634


1675










635


1676








636


1677 








637


1678








638


1679








639


1680








640 


1681








641


1682 


2361 


2813 


790 12348 


642


1683 








643 


1684 








644 


1685








645 


1686 








646


1687 


2362 


2814 


790 667 


647 


1688


2363 


2815 


787 4774 


648 


1689 


2364 


2816 


784 4739 


649


1690 








650 


1691 


2365 


2817 


785 2741 


651 


1692 








652 


1693 








653 


1694 








654


1695 








655


1696


2366 


2818 


787 10308 


656


1697








657 


1698 








658 


1699 


2367 


2819 


790 13971 


659 


1700 








660 


1701 








661 


1702 


2368 


2820 


790 1314 


662 


1703 


2369 


2821 


788 6944
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


663 


1704 


2370 


2822 


790 2750 


664 


1705 


2371 


2823 


787 9604 


665 


1706 


2372 


2824 


784 3541 


666 


1707 








667 


1708 


2373 


2825 


790 20829 




668


1709


2374 


2826 


789 1765 


669


1710 








670


1711 








671 


1712 


2375 


2827 


784 1088 


672


1713








673


1714








674


1715 








675


1716








676


1717 








677 


1718 








678 


1719








679 


1720 








680 


1721 








681 


1722 








682


1723 


2376 


2828 


791 4325 


683


1724 








684 


1725 








685


1726 








686 


1727 


2377 


2829 


790 17256 


687 


1728 


2378 


2830 


790 6038 


688


1729 








689 


1730 








690


1731 








691


1732 


2379 


2831 


784 1490 


692


1733 








693


1734 








694


1735 








695


1736 








696


1737 


2380 


2832 


784 1639 


697


1738 








698 


1739 








699 


1740 


2381 


2833 


790 3738 


700


1741 








701 


1742 








702


1743 








703


1744 








704 


1745 








705


1746 








706


1747 








707


1748 


2382 


2834 


784 4929 


708 


1749 


2383 


2835 


790 28014 


709 


1750 








710 


1751 


2384 


2836 


792 6483 


711 


1752 








712 


1753 








713 


1754 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


714 


1755 


2385 


2837 


790 15616


715 


1756 








716 


1757 








717 


1758 








718 


1759 








719 


1760 


2386 


2838 


784 1755 


720


1761 








721 


1762 








722 


1763 








723 


1764 








724 


1765 








725 


1766 








726 


1767 








727 


1768 








728 


1769 








729 


1770 








730 


1771 








731 


1772 








732 


1773 


2387 


2839 


784 3304 


733 


1774 


2388 


2840 


785 2998 


734 


1775 








735 


1776 


2389 


2841 


790 5241 


736 


1777 


2390 


2842 


787 6489 


737 


1778 


2391 


2843 


790 29981 


738 


1779 








739 


1780 








740 


1781 








741 


1782 


2392 


2844 


790 6347 


742 


1783 


2393 


2845 


790 14685 


743 


1784 








744 


1785 








745 


1786 


2394 


2846 


787 10117 


746 


1787 








747 


1788 








748 


1789 


2395 


2847 


787 1056 


749 


1790 








750 


1791 


2396 


2848 


785 1047 


751 


1792 


2397 


2849 


791 419 


752 


1793 


2398 


2850 


787 3759 


753 


1794 








754 


1795 


2399 


2851 


785 3304 


755 


1796 








756 


1797 


2400 


2852 


784 4056 


757 


1798 








758 


1799 


2401 


2853 


790 2255 


759 


1800 








760 


1801 








761 


1802 








762 


1803 


2402 


2854 


787 4393 


763 


1804 








764 


1805 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


765 


1806 


2403 


2855 


784 3297 


766 


1807 








767 


1808 








768 


1809 


2404 


2856 


784 3609


769 


1810 








770 


1811 








771 


1812 


2405 


2857 


792 6026 


772


1813 


2406 


2858 


787 9972


773


1814 








774 


1815 








775 


1816 








776 


1817 








777 


1818 








778 


1819 








779 


1820 


2407 


2859 


785 1351


780 


1821 








781 


1822 


2408 


2860 


791 3196 


782 


1823 


2409 


2861 


790 25408


783 


1824 


2410 


2862 


784 3960 


784 


1825 


2411 


2863 


787 4591 


785 


1826 


2412 


2864 


784 4366 


786 


1827 








787 


1828 


2413 


2865 


785 3201 


788 


1829 


2414 


2866 


784 360 


789 


1830 


2415 


2867 


785 1913 


790 


1831 


2416 


2868 


789 2627 


791 


1832 








792 


1833 








793 


1834 








794 


1835 








795 


1836 








796 


1837 








797 


1838 


2417 


2869 


790 2077 


798 


1839 


2418 


2870 


790 19187 


799 


1840 


2419 


2871 


789 3760 


800 


1841 


2420 


2872 


784 6919 


801 


1842 








802 


1843 


2421 


2873 


784 1456 


803 


1844 








804 


1845 








805 


1846 


2422 


2874 


784 5322 


806 


1847 


2423 


2875 


790 1305 


807 


1848 








808 


1849 








809 


1850 








810 


1851 








811 


1852 








812 


1853 








813 


1854 








814 


1855 


2424 


2876 


790 21839 


815 


1856 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


816 


1857 








817 


1858 








818 


1859 


2425 


2877 


790 20653 


819 


1860 








820 


1861 


2426 


2878 


784 8235 


821 


1862 


2427 


2879 


792 7381 


822 


1863 








823 


1864 


2428 


2880 


784 2446 


824 


1865 


2429 


2881 


787 5610 


825 


1866 








826 


1867 








827 


1868 


2430 


2882 


787 8030 


828 


1869 








829 


1870 








830 


1871


2431 


2883 


784 287 


831 


1872 


2432 


2884 


785 2857 


832 


1873 








833 


1874 








834 


1875 








835 


1876 








836 


1877 


2433 


2885 


787 7849 


837 


1878 


2434 


2886 


788_4268


838 


1879 








839 


1880 








840 


1881 








841 


1882 








842 


1883 








843 


1884 








844 


1885 


2435 


2887 


784 3976 


845 


1886 


2436 


2888 


788 13658 


846 


1887 








847 


1888 








848 


1889 


2437 


2889 


784 5652 


849 


1890 


2438 


2890 


784 6881 


850 


1891 


2439 


2891 


784 344 


851 


1892 








852 


1893 








853 


1894 








854 


1895 








855 


1896 








856 


1897 








857 


1898 








858 


1899 


2440 


2892 


790 1219 


859 


1900 


2441 


2893 


790 19855 


860 


1901 








861 


1902 


2442 


2894 


784 4089


862 


1903 


2443 


2895 


787 4525 


863 


1904 








864 


1905 








865 


1906 


2444 


2896 


791_14


866 


1907 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


867


1908








868


1909








869


1910


2445


2897


792 8447


870


1911








871


1912








872


1913


2446


2898


790 12289


873


1914








874


1915


2447


2899


791 938


875


1916


2448


2900


787 2708


876


1917


2449


2901


790 28624


877


1918








878


1919








879


1920








880


1921


2450


2902


790 9414


881


1922








882


1923








883


1924








884


1925


2451


2903


790 29172


885


1926


2452


2904


785 1259


886


1927








887


1928


2453


2905


790 11594


888


1929


2454


2906


790 4305


889


1930


2455


2907


792 4498


890


1931








891


1932








892


1933








893


1934








894


1935








895


1936








896


1937


2456


2908


790 2984


897


1938








898


1939


2457


2909


790 11010


899


1940


2458


2910


790 21318


900


1941


2459


2911


790 3969


901


1942


2460


2912


785 3697


902


1943


2461


2913


785 3750


903


1944


2462


2914


787 10293


904


1945


2463


2915


787 5468


905


1946








906


1947


2464


2916


784 4027


907


1948








908


1949


2465


2917


791 1076


909


1950


2466


2918


790 14655


910


1951








911


1952


2467


2919


788 11281


912 


1953 


2468 


2920 


784 3554 


913 


1954 


2469 


2921 


784 6827 


914 


1955 








915 


1956 








916 


1957 








917 


1958 


2470 


2922 


789 4549 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


918 


1959 








919 


1960 


2471 


2923 


790 948 


920 


1961 








921


1962 


2472


2924 


789 682 


922 


1963 


2473 


2925 


787 2281


923 


1964 








924 


1965 


2474 


2926 


790 11999 


925


1966 


2475 


2927 


790 28325 


926


1967 


2476 


2928 


790 7793 


927


1968


2477 


2929 


792 3501 


928


1969 








929


1970 


2478 


2930 


790 4547 


930


1971


2479 


2931 


788 5864 


931


1972 








932


1973 


2480 


2932 


790 24604


933


1974








934 


1975 


2481 


2933 


790 25716 


935


1976 


2482 


2934 


785 1851 


936


1977 


2483 


2935 


785 1852 


937


1978 


2484 


2936 


785 1155 


938 


1979 


2485 


2937 


785 3352 


939


1980 








940 


1981 


2486 


2938 


785 1297 


941 


1982 


2487 


2939 


785 477 


942 


1983 


2488 


2940 


785 2441 


943 


1984 


2489 


2941 


785 1294 


944 


1985 








945 


1986 








946 


1987








947 


1988 


2490 


2942 


789 4549 


948


1989


2491 


2943 


784 6979 


949


1990 


2492 


2944 


784 8567 


950 


1991 


2493 


2945 


790 14286 


951 


1992 


2494 


2946 


784 8986 


952 


1993 








953 


1994 


2495 


2947 


790 12510 


954 


1995 








955


1996 








956 


1997 








957


1998


2496 


2948 


787 3623 


958 


1999








959


2000 








960 


2001 








961 


2002 


2497 


2949 


792 4842 


962 


2003 


2498 


2950 


784 9156 


963 


2004 








964 


2005 








965 


2006 








966 


2007 


2499 


2951 


784 2649 


967 


2008 


2500 


2952 


785 544 


968 


2009 


2501 


2953 


787 4148 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


969


2010








970


2011


2502


2954


784 5145


971


2012








972


2013


2503


2955


784 919


973


2014








974


2015


2504


2956


787 2532


975


2016


2505


2957


788 13689


976


2017








977


2018


2506


2958


784 2950


978


2019








979


2020








980


2021


2507


2959


784 4027


981


2022


2508


2960


785 332


982


2023








983


2024








984


2025


2509


2961


784 1944


985


2026


2510


2962


787 6916


986


2027


2511


2963


787 2539


987


2028








988


2029


2512


2964


787 10243


989


2030








990


2031








991


2032


2513


2965


787 5673


992


2033








993


2034








994


2035








995


2036








996


2037








997


2038








998


2039


2514


2966


787 2168


999


2040


2515


2967


784 1151


1000


2041








1001


2042








1002


2043


2516


2968


787 3680


1003


2044


2517


2969


787 5181


1004


2045


2518


2970


787 3356


1005


2046


2519


2971


785 254


1006


2047








1007


2048








1008


2049


2520


2972


789 1109


1009


2050








1010


2051








1011


2052


2521


2973


790 7032


1012


2053


2522


2974


791 4111


1013


2054








1014 


2055 








1015 


2056 


2523 


2975 


790 11262 


1016 


2057 


2524 


2976 


787 2040 


1017 


2058 








1018 


2059 








1019 


2060 
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Table 9 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Identification of Priority Application that contig nucleotide sequence was filed (Attorney Docket No. SEQ ID NO.) *


1020 


2061 








1021 


2062 


2525 


2977 


785 1902 


1022 


2063 


2526 


2978 


790 12167 


1023 


2064 








1024 


2065 








1025 


2066 








1026 


2067 








1027 


2068 


2527 


2979 


784 9027 


1028 


2069 


2528 


2980 


790 8294 


1029 


2070 








1030 


2071 


2529 


2981 


784 5029 


1031 


2072 


2530 


2982 


784 3541 


1032 


2073 








1033 


2074 


2531 


2983 


787 5870 


1034 


2075 








1035 


2076 


2532 


2984 


787 2733 


1036 


2077 


2533 


2985 


785 581 


1037 


2078 


2534 


2986 


787 9345 


1038 


2079 








1039 


2080 








1040 


2081 








1041 


2082 









*784_XXX = SEQ ID NO: XXX of Attorney Docket No. 784, US Serial No. 09/488,725 
filed 01/21/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

785_XXX = SEQ ID NO: XXX of Attorney Docket No. 785, US Serial No. 09/491,404 
filed 01/25/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

787_XXX = SEQ ID NO: XXX of Attorney Docket No. 787, US Serial No. 09/496,914 
filed 02/03/2000, the entire disclosure of which, including sequence listing, is
incorporated herein by reference. 

788_XXX = SEQ ID NO: XXX of Attorney Docket No. 788, US Serial No. 09/515,126 
filed 02/28/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

789_XXX = SEQ ID NO: XXX of Attorney Docket No. 789, US Serial No. 09/519,705
filed 03/07/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 

790_XXX = SEQ ID NO: XXX of Attorney Docket No. 790, US Serial No. 09/540,217 
filed 03/31/2000, the entire disclosure of which, including sequence listing, is
incorporated herein by reference. 
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791_XXX = SEQ ID NO: XXX of Attorney Docket No. 791, US Serial No. 09/552,929 
filed 04/18/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 


792_XXX = SEQ ID NO: XXX of Attorney Docket No. 792, US Serial No. 09/577,408 
filed 05/18/2000, the entire disclosure of which, including sequence listing, is 
incorporated herein by reference. 
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Table 10 



SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261








1


1042


1


2


1043


2


3


1044


3


4


1045


4


5


1046


5


6


1047


6


7


1048


7


8


1049


8


9


1050


9


10


1051


10


11


1052


11


12


1053


12


13


1054


13


14


1055


14


15


1056


15


16


1057


16


17


1058


17


18


1059


18


19


1060


19


20


1061


20


21


1062


21


22


1063


22


23


1064


23


24


1065


24


25


1066


25


26


1067


26


27


1068


27


28


1069


28


29


1070


29


30


1071


30


31


1072


31


32


1073


32


33


1074


33


34


1075


34


35


1076


35


36


1077


36


37


1078


37


38


1079


38


39


1080


39


40


1081


40


41


1082


41


42


1083


42


43


1084


43


44


1085


44


45


1086


45


46


1087


46


47


1088


47


48


1089


48


49


1090


49


50


1091


50


51


1092


51


52


1093


52
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Table 10 



SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261


53


1094


53


54


1095


54


55


1096


55


56


1097


56


57


1098


57


58


1099


58


59


1100


59


60


1101


60


61


1102


61


62


1103


62


63


1104


63


64


1105


64


65


1106


65


66


1107


66


67


1108


67


68


1109


68


69


1110


69


70


1111


70


71


1112


71


72


1113


72


73


1114


73


74


1115


74


75


1116


75


76


1117


76


77


1118


77


78


1119


78


79


1120


79


80


1121


80


81


1122


81


82


1123


82


83


1124


83


84


1125


84


85


1126


85


86


1127


86


87


1128


87


88


1129


88


89


1130


89


90


1131


90


91


1132


91


92


1133


92


93


1134


93


94


1135


94


95


1136


95


96


1137


96


97


1138


97


98


1139


98


99


1140


99


100


1141


100


101


1142


101


102


1143


102


103


1144


103


104


1145


104


105


1146


105
SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261


106 


1147 


106


107 


1148 


107 


108


1149 


108 


109 


1150 


109 


110 


1151 


110 


111 


1152 


111 


112 


1153 


112 


113 


1154 


113 


114 


1155 


114 


115 


1156 


115 


116 


1157 


116 


117 


1158 


117 


118 


1159 


118 


119 


1160 


119 


120 


1161 


120 


121 


1162 


121 


122 


1163 


122 


123


1164 


123 


124 


1165 


124 


125 


1166 


125 


126 


1167 


126 


127 


1168 


127 


128 


1169 


128 


129 


1170 


129 


130 


1171 


130 


131 


1172 


131 


132 


1173 


132 


133 


1174 


133 


134 


1175 


134 


135 


1176 


135 


136 


1177 


136 


137 


1178 


137 


138 


1179 


138 


139 


1180 


139 


140 


1181 


140 


141 


1182 


141 


142 


1183 


142 


143 


1184 


143 


144 


1185 


144 


145 


1186 


145 


146 


1187 


146 


147 


1188 


147 


148 


1189 


148 


149 


1190 


149 


150 


1191 


150 


151 


1192 


151 


152 


1193 


152 


153 


1194 


153 


154 


1195 


154 


155 


1196 


155 


156 


1197 


156 


157 


1198 


157 


158 


1199 


158 
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SEQ ID NO of Full-length 
Nucleotide Sequence 


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261


159


1200


159


160


1201


160


161


1202


161


162


1203


162


163


1204


163


164


1205


164


165


1206


165


166


1207


166


167


1208


167


168


1209


168


169


1210


169


170


1211


170


171


1212


171


172


1213


172


173


1214


173


174


1215


174


175


1216


175


176


1217


176


177


1218


177


178


1219


178


179


1220


179


180


1221


180


181


1222


181


182


1223


182


183


1224


183


184


1225


184


185


1226


185


186


1227


186


187


1228


187


188


1229


188


189


1230


189


190


1231


190


191


1232


191


192


1233


192


193


1234


193


194


1235


194


195


1236


195


196


1237


196


197


1238


197


198


1239


198


199


1240


199


200


1241


200


201


1242


201


202


1243


202


203


1244


203


204


1245


204


205


1246


205


206


1247


206


207 


1248 


207 


208 


1249 


208 


209 


1250 


209 


210 


1251 


210 


211 


1252 


211 
SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261


212


1253


212


213


1254


213


214


1255


214


215


1256


215


216


1257


216


217


1258


217


218


1259


218


219


1260


219


220


1261


220


221


1262


221


222


1263


222


223


1264


223


224


1265


224


225


1266


225


226


1267


226


227


1268


227


228


1269


228


229


1270


229


230


1271


230


231


1272


231


232


1273


232


233


1274


233


234


1275


234


235


1276


235


236


1277


236


237


1278


237


238


1279


238


239


1280


239


240


1281


240


241


1282


241


242


1283


242


243


1284


243


244


1285


244


245


1286


245


246


1287


246


247


1288


247


248


1289


248


249


1290


249


250


1291


250


251


1292


251


252


1293


252


253


1294


253


254


1295


254


255


1296


255


256


1297


256


257


1298


257


258 


1299 


258 


259 


1300 


259 


260 


1301 


260 


261 


1302 


261 


262 


1303 


262 


263 


1304 


263 


264 


1305 


264 
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SEQ ID NO of Full-length 
Nucleotide Sequence 


SEQ ID NO of Full-length 
Peptide Sequence 


SEQ ID NO in 
Priority Application 
USSN 60/311,261


265 


1306 


265


266 


1307 


266


267 


1308 


267


268 


1309 


268


269 


1310 


269 


270 


1311 


270 


271 


1312 


271 


272 


1313 


272 


273 


1314 


273 


274 


1315 


274 


275 


1316 


275 


276 


1317 


276 


277 


1318 


277 


278 


1319 


278 


279 


1320 


279


280 


1321 


280 


281 


1322 


281


282 


1323 


282 


283 


1324 


283 


284


1325


284 


285 


1326 


285 


286 


1327 


286 


287 


1328 


287 


288 


1329 


288


289 


1330 


289


290 


1331 


290 


291 


1332 


291 


292 


1333 


292 


293 


1334 


293 


294 


1335 


294 


295 


1336 


295 


296 


1337 


296 


297 


1338 


297 


298 


1339 


298


299 


1340 


299


300 


1341 


300


301 


1342 


301 


302 


1343 


302 


303 


1344 


303 


304 


1345 


304 


305 


1346 


305 


306 


1347 


306 


307 


1348 


307 


308 


1349 


308 


309 


1350 


309 


310 


1351 


310 


311 


1352


311


312 


1353 


312 


313 


1354 


313 


314 


1355 


314 


315 


1356 


315 


316 


1357 


316 


317 


1358 


317 
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SEQ ID NO of Full-length 
Nucleotide Sequence 


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261


318


1359


318


319


1360


319


320


1361


320


321


1362


321


322


1363


322


323


1364


323


324


1365


324


325


1366


325


326


1367


326


327


1368


327


328


1369


328


329


1370


329


330


1371


330


331


1372


331


332


1373


332


333


1374


333


334


1375


334


335


1376


335


336


1377


336


337


1378


337


338


1379


338


339


1380


339


340


1381


340


341


1382


341


342


1383


342


343


1384


343


344


1385


344


345


1386


345


346


1387


346


347


1388


347


348


1389


348


349


1390


349


350


1391


350


351


1392


351


352


1393


352


353


1394


353


354


1395


354


355


1396


355


356


1397


356


357


1398


357


358


1399


358


359


1400


359


360


1401


360


361


1402


361


362


1403


362


363


1404


363


364


1405


364


365 


1406 


365 


366 


1407 


366 


367 


1408 


367 


368 


1409 


368 


369 


1410 


369 


370 


1411 


370 
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SEQ ID NO of Full-length 
Nucleotide Sequence 


SEQ ID NO of Full-length 
Peptide Sequence 


SEQ ID NO in
Priority Application
USSN 60/311,261


371


1412


371


372


1413


372


373


1414


373


374


1415


374


375


1416


375


376


1417


376


377


1418


377


378


1419


378


379


1420


379


380


1421


380


381


1422


381


382


1423


382


383


1424


383


384


1425


384


385


1426


385


386


1427


386


387


1428


387


388


1429


388


389


1430


389


390


1431


390


391


1432


391


392


1433


392


393


1434


393


394


1435


394


395


1436


395


396


1437


396


397


1438


397


398


1439


398


399


1440


399


400


1441


400


401


1442


401


402


1443


402


403


1444


403


404


1445


404


405


1446


405


406


1447


406


407


1448


407


408


1449


408


409


1450


409


410


1451


410


411


1452


411


412


1453


412


413


1454


413


414


1455


414


415


1456


415


416


1457


416


417


1458


417


418 


1459 


418 


419 


1460 


419 


420 


1461 


420 


421 


1462 


421 


422 


1463 


422


423 


1464 


423 
SEQ ID NO of Full-length
Nucleotide Sequence


SEQ ID NO of Full-length
Peptide Sequence


SEQ ID NO in
Priority Application
USSN 60/311,261


424


1465


424


425


1466


425


426


1467


426


427


1468


427


428


1469


428


429


1470


429


430


1471


430


431


1472


431


432


1473


432


433


1474


433


434


1475


434


435


1476


435


436


1477


436


437


1478


437


438


1479


438


439


1480


439


440


1481


440


441


1482


441


442


1483


442


443


1484


443


444


1485


444


445


1486


445


446


1487


446


447


1488


447


448


1489


448


449


1490


449


450


1491


450


451


1492


451


452


1493


452


453


1494


453


454


1495


454


455


1496


455


456


1497


456


457


1498


457


458


1499


458


459


1500


459


460


1501


460


461


1502


461


462


1503


462


463


1504


463


464


1505


464


465


1506


465


466


1507


466


467


1508


467


468


1509


468


469


1510


469


470


1511


470


471 


1512 


471 


472 


1513 


472 


473 


1514 


473 ] 


474 


1515 


474 


475 


1516 


475 


476 


1517 


476 
477             1518            477
478             1519            478
479             1520            479
480             1521            480
481             1522            481
482             1523            482
483             1524            483
484             1525            484
485             1526            485
486             1527            486
487             1528            487
488             1529            488
489             1530            489
490             1531            490
491             1532            491
492             1533            492
493             1534            493
494             1535            494
495             1536            495
496             1537            496
497             1538            497
498             1539            498
499             1540            499
500             1541            500
501             1542            501
502             1543            502
503             1544            503
504             1545            504
505             1546            505
506             1547            506
507             1548            507
508             1549            508
509             1550            509
510             1551            510
511             1552            511
512             1553            512
513             1554            513
514             1555            514
515             1556            515
516             1557            516
517             1558            517
518             1559            518
519             1560            519
520             1561            520
521             1562            521
522             1563            522
523             1564            523
524             1565            524
525             1566            525
526             1567            527
527             1568            528
528             1569            529
529             1570            530
530             1571            531
531             1572            532
532             1573            533
533             1574            534
534             1575            535
535             1576            536
536             1577            537
537             1578            538
538             1579            539
539             1580            540
540             1581            541
541             1582            542
542             1583            543
543             1584            544
544             1585            545
545             1586            546
546             1587            547
547             1588            548
548             1589            549
549             1590            550
550             1591            551
551             1592            552
552             1593            553
553             1594            554
554             1595            555
555             1596            556
556             1597            557
557             1598            558
558             1599            559
559             1600            560
560             1601            561
561             1602            562
562             1603            563
563             1604            564
564             1605            565
565             1606            566
566             1607            567
567             1608            568
568             1609            569
569             1610            570
570             1611            571
571             1612            572
572             1613            573
573             1614            574
574             1615            575
575             1616            576
576             1617            577
577             1618            578
578             1619            579
579             1620            580
580             1621            581
581             1622            582
582             1623            583
583             1624            584
584             1625            585
585             1626            586
586             1627            587
587             1628            588
588             1629            589
589             1630            590
590             1631            591
591             1632            592
592             1633            593
593             1634            594
594             1635            595
595             1636            596
596             1637            597
597             1638            598
598             1639            599
599             1640            600
600             1641            601
601             1642            602
602             1643            603
603             1644            604
604             1645            605
605             1646            606
606             1647            607
607             1648            608
608             1649            609
609             1650            610
610             1651            611
611             1652            612
612             1653            613
613             1654            614
614             1655            615
615             1656            616
616             1657            617
617             1658            618
618             1659            619
619             1660            620
620             1661            621
621             1662            622
622             1663            623
623             1664            624
624             1665            625
625             1666            626
626             1667            627
627             1668            628
628             1669            629
629             1670            630
630             1671            631
631             1672            632
632             1673            633
633             1674            634
634             1675            635
635             1676            636
636             1677            637
637             1678            638
638             1679            639
639             1680            640
640             1681            641
641             1682            642
642             1683            643
643             1684            644
644             1685            645
645             1686            646
646             1687            647
647             1688            648
648             1689            649
649             1690            650
650             1691            651
651             1692            652
652             1693            653
653             1694            654
654             1695            655
655             1696            656
656             1697            657
657             1698            658
658             1699            659
659             1700            660
660             1701            661
661             1702            662
662             1703            663
663             1704            664
664             1705            665
665             1706            666
666             1707            667
667             1708            668
668             1709            669
669             1710            670
670             1711            671
671             1712            672
672             1713            673
673             1714            674
674             1715            675
675             1716            676
676             1717            677
677             1718            678
678             1719            679
679             1720            680
680             1721            681
681             1722            682
682             1723            683
683             1724            684
684             1725            685
685             1726            686
686             1727            687
687             1728            688
688             1729            689
689             1730            690
690             1731            691
691             1732            692
692             1733            693
693             1734            694
694             1735            695
695             1736            696
696             1737            697
697             1738            698
698             1739            699
699             1740            700
700             1741            701
701             1742            702
702             1743            703
703             1744            704
704             1745            705
705             1746            706
706             1747            707
707             1748            708
708             1749            709
709             1750            710
710             1751            711
711             1752            712
712             1753            713
713             1754            714
714             1755            715
715             1756            716
716             1757            717
717             1758            718
718             1759            719
719             1760            720
720             1761            721
721             1762            722
722             1763            723
723             1764            724
724             1765            725
725             1766            726
726             1767            727
727             1768            728
728             1769            729
729             1770            730
730             1771            731
731             1772            732
732             1773            733
733             1774            734
734             1775            735
735             1776            736
736             1777            737
737             1778            738
738             1779            739
739             1780            740
740             1781            741
741             1782            742
742             1783            743
743             1784            744
744             1785            745
745             1786            746
746             1787            747
747             1788            748
748             1789            749
749             1790            750
750             1791            751
751             1792            752
752             1793            753
753             1794            754
754             1795            755
755             1796            756
756             1797            757
757             1798            758
758             1799            759
759             1800            760
760             1801            761
761             1802            762
762             1803            763
763             1804            764
764             1805            765
765             1806            766
766             1807            767
767             1808            768
768             1809            769
769             1810            770
770             1811            771
771             1812            772
772             1813            773
773             1814            774
774             1815            775
775             1816            776
776             1817            777
777             1818            778
778             1819            779
779             1820            780
780             1821            781
781             1822            782
782             1823            783
783             1824            784
784             1825            785
785             1826            786
786             1827            787
787             1828            788
788             1829            789
789             1830            790
790             1831            791
791             1832            792
792             1833            793
793             1834            794
794             1835            795
795             1836            796
796             1837            797
797             1838            798
798             1839            799
799             1840            800
800             1841            801
801             1842            802
802             1843            803
803             1844            804
804             1845            805
805             1846            806
806             1847            807
807             1848            808
808             1849            809
809             1850            810
810             1851            811
811             1852            812
812             1853            813
813             1854            814
814             1855            815
815             1856            816
816             1857            817
817             1858            818
818             1859            819
819             1860            820
820             1861            821
821             1862            822
822             1863            823
823             1864            824
824             1865            825
825             1866            826
826             1867            827
827             1868            828
828             1869            829
829             1870            830
830             1871            831
831             1872            832
832             1873            833
833             1874            834
834             1875            835
835             1876            836
836             1877            837
837             1878            838
838             1879            839
839             1880            840
840             1881            841
841             1882            842
842             1883            843
843             1884            844
844             1885            845
845             1886            846
846             1887            847
847             1888            848
848             1889            849
849             1890            850
850             1891            851
851             1892            852
852             1893            853
853             1894            854
854             1895            855
855             1896            856
856             1897            857
857             1898            858
858             1899            859
859             1900            860
860             1901            861
861             1902            862
862             1903            863
863             1904            864
864             1905            865
865             1906            866
866             1907            867
867             1908            868
868             1909            869
869             1910            870
870             1911            871
871             1912            872
872             1913            873
873             1914            874
874             1915            875
875             1916            876
876             1917            877
877             1918            878
878             1919            879
879             1920            880
880             1921            881
881             1922            882
882             1923            883
883             1924            884
884             1925            885
885             1926            886
886             1927            887
887             1928            888
888             1929            889
889             1930            890
890             1931            891
891             1932            892
892             1933            893
893             1934            894
894             1935            895
895             1936            896
896             1937            897
897             1938            898
898             1939            899
899             1940            900
900             1941            901
901             1942            902
902             1943            903
903             1944            904
904             1945            905
905             1946            906
906             1947            907
907             1948            908
908             1949            909
909             1950            910
910             1951            911
911             1952            912
912             1953            913
913             1954            914
914             1955            915
915             1956            916
916             1957            917
917             1958            918
918             1959            919
919             1960            920
920             1961            921
921             1962            922
922             1963            923
923             1964            924
924             1965            925
925             1966            926
926             1967            927
927             1968            928
928             1969            929
929             1970            930
930             1971            931
931             1972            932
932             1973            933
933             1974            934
934             1975            935
935             1976            936
936             1977            937
937             1978            938
938             1979            939
939             1980            940
940             1981            941
941             1982            942
942             1983            943
943             1984            944
944             1985            945
945             1986            946
946             1987            947
947             1988            948
948             1989            949
949             1990            950
950             1991            951
951             1992            952
952             1993            953
953             1994            954
954             1995            955
955             1996            956
956             1997            957
957             1998            958
958             1999            959
959             2000            960
960             2001            961
961             2002            962
962             2003            963
963             2004            964
964             2005            965
965             2006            966
966             2007            967
967             2008            968
968             2009            969
969             2010            970
970             2011            971
971             2012            972
972             2013            973
973             2014            974
974             2015            975
975             2016            976
976             2017            977
977             2018            978
978             2019            979
979             2020            980
980             2021            981
981             2022            982
982             2023            983
983             2024            984
984             2025            985
985             2026            986
986             2027            987
987             2028            988
988             2029            989
989             2030            990
990             2031            991
991             2032            992
992             2033            993
993             2034            994
994             2035            995
995             2036            996
996             2037            997
997             2038            998
998             2039            999
999             2040            1000
1000            2041            1001
1001            2042            1002
1002            2043            1003
1003            2044            1004
1004            2045            1005
1005            2046            1006
1006            2047            1007
1007            2048            1008
1008            2049            1009
1009            2050            1010
1010            2051            1011
1011            2052            1012
1012            2053            1013
1013            2054            1014
1014            2055            1015
1015            2056            1016
1016            2057            1017
1017            2058            1018
1018            2059            1019
1019            2060            1020
1020            2061            1021
1021            2062            1022
1022            2063            1023
1023            2064            1024
1024            2065            1025
1025            2066            1026
1026            2067            1027
1027            2068            1028
1028            2069            1029
1029            2070            1030
1030            2071            1031
1031            2072            1032
1032            2073            1033
1033            2074            1034
1034            2075            1035
1035            2076            1036
1036            2077            1037
1037            2078            1038
1038            2079            1039
1039            2080            1040
1040            2081            1041
1041            2082            1042

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-1041. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein 
said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein 
said polynucleotide has greater than about 99% sequence identity with the polynucleotide of 
claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1
operatively associated with a regulatory sequence that modulates expression of the
polynucleotide in the host cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting 
of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1;
and

(b) a polypeptide encoded by a polynucleotide hybridizing under
stringent conditions with any one of SEQ ID NO: 1-1041.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; and

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions with
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions;

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the
method further comprises reverse transcribing an annealed RNA molecule into a cDNA
polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex; and

b) detecting formation of the complex, so that if a complex formation is
detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex
is detected, a compound that binds to the polypeptide of claim 10 is identified.

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, 
under conditions sufficient to form a polypeptide/compound complex, wherein the complex 
drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, 
so that if the polypeptide/compound complex is detected, a compound that binds to the 
polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected 
from the group consisting of any of the polynucleotides from SEQ ID NO: 1-1041, under 
conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 1042-2082. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises at least one of
SEQ ID NO: 1-1041. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 



